LARGE DEVIATIONS OF THE EXTREME EIGENVALUES OF 
RANDOM DEFORMATIONS OF MATRICES 



F. BENAYCH-GEORGES, A. GUIONNET, M. MAIDA.

Abstract. Consider a real diagonal deterministic matrix Xn of size n with spectral 
measure converging to a compactly supported probability measure. We perturb this 
matrix by adding a random finite rank matrix, with delocalized eigenvectors. We show 
that the joint law of the extreme eigenvalues of the perturbed model satisfies a large 
deviation principle in the scale n, with a good rate function given by a variational 
formula. 

We tackle both cases when the extreme eigenvalues of Xn converge to the edges of the 
support of the limiting measure and when we allow some eigenvalues of Xn , that we call 
outliers, to converge out of the bulk. 

We can also generalise our results to the case when Xn is random, with law proportional
to $e^{-n \operatorname{tr} V(X)}\,dX$, for V growing fast enough at infinity, and any perturbation of finite
rank.

1. Introduction

In the last twenty years, many features of the asymptotics of the spectrum of large 
random matrices have been understood. For a wide variety of classical models of random 
matrices (the canonical examples hereafter will be Wigner matrices [36], or Wishart ma- 
trices [31]), it has been shown that the spectral measure converges almost surely. The 
extreme eigenvalues converge for most of these models to the boundaries of the limiting 
spectral measure (see e.g. [25] or [1]). Fluctuations of the spectral measure and the 
extreme eigenvalues of these models could also be studied under a fair generality over 
the entries of the matrices; we refer to [22] and [2], or [1] and [3] for reviews. Recently,
even the fluctuations of the eigenvalues inside the bulk could be studied for rather general
entries and were shown to be universal (see e.g. [TS] or [M]). Concentration of measure
phenomena and moderate deviations could also be established in [20], [15], [E].

Yet, the understanding of the large deviations of the spectrum of large random matrices 
is still very scarce and exists only in very specific cases. Indeed, the spectrum of a ma- 
trix is a very complicated function of the entries, so that usual large deviation theorems, 
mainly based on independence, do not apply. Moreover, large deviations rate functions 
have to depend on the distribution of the entries and only guessing their definition is still 
a widely open question. In the case of Gaussian Wigner matrices, where the joint law of 
the eigenvalues is simply given by a Coulomb gas Gibbs measure, things are much easier 
and a full large deviation principle for the law of the spectral measure of such matrices 
was proved in [2] . This extends to other ensembles distributed according to similar Gibbs 
measure, for instance Gaussian Wishart matrices [21]. Similar large deviation results hold 
in discrete situations with a Coulomb gas distribution [22]. A large deviation principle 
was also established in [26] for the law of the spectral measure of a random matrix given as 
the sum of a self-adjoint Gaussian Wigner random matrix and a deterministic self-adjoint 
matrix (or as a Gaussian Wishart matrix with non trivial covariance matrix). In this case, 
the proof uses stochastic analysis and Dyson's Brownian motion, as there is no explicit 
joint law for the eigenvalues, but again relies heavily on the fact that the random matrix 
has Gaussian entries. 

The large deviations for the law of the extreme eigenvalues were studied in a slightly more 
general setting. Again relying on the explicit joint law of the eigenvalues, a large deviation 
principle was derived in [8] for the same Gaussian type models. The large deviations of 
extreme eigenvalues of Gaussian Wishart matrices were studied in [35]. In the case where 
the Wishart matrix is of the form XX* with X an n x r rectangular matrix such that the ratio
r/n of its dimensions goes to zero, large deviation bounds for the extreme eigenvalues
could be derived under more general assumptions on the entries in [23]. Our approach
also yields a full large deviation principle for the spectrum of such Wishart matrices when
r is kept fixed while n goes to infinity (see Section [7]).

In this article, we shall be concerned with the effect of finite rank deformations on 
the deviations of the extreme eigenvalues of random matrices. In fact, using Weyl's 
interlacing property, it is easy to check that such finite rank perturbations do not change 
the deviations of the spectral measure. But they strongly affect the behavior of a few
extreme eigenvalues, not only at the level of deviations but also as far as convergence and
fluctuations are concerned. In the case of Gaussian Wishart matrices, the asymptotics
of these extreme eigenvalues were established in [7] and a sharp phase transition, known
as the BBP transition, was exhibited. According to the strength of the perturbation,
the extreme eigenvalues converge to the edge of the bulk or away from the bulk. The
fluctuations of these eigenvalues were also shown in [7] to be given either by the Tracy-
Widom distribution in the first case, or by the Gaussian distribution in the second case.
Universality (and non-universality) of the fluctuations in the BBP transition was studied for
various models, see e.g. [13], [HI], [T21], [E].

The goal of this article is to study the large deviations of the extreme eigenvalues
of such finite rank perturbations of large random matrices. A large deviation
principle for the largest eigenvalue of matrices of the GOE and GUE deformed by a
rank one matrix was previously obtained by using fine asymptotics of the Itzykson-Zuber-Harish-
Chandra (or spherical) integrals. The large deviations of the extreme eigenvalues of a
Wigner matrix perturbed by a matrix with finite rank greater than one turned out to be
much more complicated. One of the outcomes of this paper is to prove such a large
deviation result when the Wigner matrix is Gaussian. In fact, our result will include the
more general case where the non-perturbed matrices are taken in some classical matrix
ensembles, namely the ones with distribution $\propto e^{-n \operatorname{tr}(V(X))}\,dX$, for which the deviations
are well known (see Theorem 2.10). We first tackle a closely related question: the large
deviation properties of the largest eigenvalues of a deterministic matrix Xn perturbed by 
a finite rank random matrix. We show that the law of these extreme eigenvalues satisfies 
a large deviation principle for a fairly general class of random finite rank perturbations. 
We can then consider random matrices Xn, independent of the perturbation, by studying 
the deviations of the perturbed matrix conditionally to the non-perturbed matrices. Even 
though our rate functions are not very explicit in general, in the simple case where Xn = 0, 
we can retrieve more explicit formulae (see Section [7]). In fact, even in this simple case 
of sample covariance matrices with non-Gaussian entries, our large deviation result seems 
to be new and improves on [23].

Our approach is based, as in [13], [El], [E], on the characterization of the eigenvalues via
the determinant of a matrix with fixed size: it is an r x r matrix whose entries are the
Stieltjes transforms of the non-deformed matrix evaluated along the random vectors of
the perturbation. We obtain a large deviation principle for the law of this characteristic
polynomial (seen as a continuous function outside of the spectrum of the deterministic
matrix) by classical large deviation techniques. Even though the map which associates
to a function its zeroes is not continuous for the weak topology, we deduce from the
latter a large deviation principle for the law of the zeroes of this characteristic polynomial,
that is, the extreme eigenvalues of the deformed matrix model.



2. Statement of the results 

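2.1. The models. Let $X_n$ be a real diagonal matrix of size n x n with eigenvalues
$\lambda_1^n \ge \lambda_2^n \ge \dots \ge \lambda_n^n$.

We perturb $X_n$ by a random matrix whose rank does not depend on n. More precisely,
let m, r be fixed positive integers and $\theta_1 \ge \theta_2 \ge \dots \ge \theta_m > 0 > \theta_{m+1} \ge \dots \ge \theta_r$
be fixed, let $G = (g_1, \dots, g_r)$ be a random vector and $(G(k) = (g_1(k), \dots, g_r(k)))_{k \ge 1}$ be
independent copies of G. We then define the r vectors with dimension n

$$G_i^n := (g_i(1), \dots, g_i(n))^T, \qquad 1 \le i \le r,$$

and study the eigenvalues $\widetilde{\lambda}_1^n \ge \dots \ge \widetilde{\lambda}_n^n$ of the deformed matrix

$$\widetilde{X}_n = X_n + \frac{1}{n} \sum_{i=1}^r \theta_i\, G_i^n (G_i^n)^*. \qquad (1)$$

In the sequel, we will refer to the model (1) as the i.i.d. perturbation model.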
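As a purely illustrative sketch (not part of the original argument), the i.i.d. perturbation model can be simulated numerically. The function name `iid_perturbation`, the Gaussian choice for the entries of G, the uniform diagonal of $X_n$ and the numerical values of the $\theta_i$'s are all assumptions made for the example.

```python
import numpy as np

def iid_perturbation(x_diag, theta, rng):
    """Return X_n + (1/n) * sum_i theta_i G_i G_i^T for a real diagonal X_n.

    Hypothetical helper: the entries of G are taken standard Gaussian,
    which is only one admissible choice for the law of G.
    """
    n, r = len(x_diag), len(theta)
    G = rng.standard_normal((n, r))            # column i plays the role of G_i^n
    # (G * theta) scales column i by theta_i, so the product is sum_i theta_i G_i G_i^T
    X_tilde = np.diag(x_diag) + (G * theta) @ G.T / n
    return X_tilde, G

rng = np.random.default_rng(0)
n = 200
x_diag = rng.uniform(-1.0, 1.0, size=n)        # assumed spectrum of X_n
theta = np.array([3.0, 1.5, -2.0])             # here m = 2 positive values, r = 3
X_tilde, G = iid_perturbation(x_diag, theta, rng)
top_two = np.linalg.eigvalsh(X_tilde)[::-1][:2]  # the m largest eigenvalues
```

The m largest eigenvalues returned in `top_two` are the quantities whose large deviations are studied in the sequel.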

Alternatively, if we assume moreover that the law of G does not charge any hyperplane,
then, for n > r, the r vectors $G_1^n, \dots, G_r^n$ are almost surely linearly independent
and we denote by $(U_i^n)_{1 \le i \le r}$ the vectors obtained from $(G_i^n)_{1 \le i \le r}$ by a Gram-Schmidt
orthonormalisation procedure with respect to the usual scalar product on $\mathbb{C}^n$. We shall
then consider the eigenvalues $\widetilde{\lambda}_1^n \ge \dots \ge \widetilde{\lambda}_n^n$ of

$$\widetilde{X}_n = X_n + \sum_{i=1}^r \theta_i\, U_i^n (U_i^n)^* \qquad (2)$$

and refer in the sequel to the model (2) as the orthonormalized perturbation model.

If $g_1, \dots, g_r$ are r independent standard (real or complex) Gaussian variables, it is well
known that the law of $(U_i^n)_{1 \le i \le r}$ is the uniform measure on the set of r orthonormal
vectors. The model (2) then coincides with the one introduced in [TT].
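A minimal numerical sketch of the orthonormalized perturbation model follows. It relies on the fact that a reduced QR factorisation yields the Gram-Schmidt vectors up to signs; the sign correction, the helper name `orthonormalized_perturbation` and all numerical values are assumptions of this example rather than anything prescribed by the text.

```python
import numpy as np

def orthonormalized_perturbation(x_diag, theta, G):
    """Return X_n + sum_i theta_i U_i U_i^T, with U the Gram-Schmidt vectors of G."""
    Q, R = np.linalg.qr(G)                 # reduced QR: Q is n x r, R is r x r
    U = Q * np.sign(np.diag(R))            # flip signs so that U matches Gram-Schmidt
    X_tilde = np.diag(x_diag) + (U * theta) @ U.T
    return X_tilde, U

rng = np.random.default_rng(1)
n = 200
x_diag = rng.uniform(-1.0, 1.0, size=n)    # assumed spectrum of X_n
theta = np.array([3.0, 1.5, -2.0])
G = rng.standard_normal((n, len(theta)))   # Gaussian G: U is then uniformly distributed
X_tilde, U = orthonormalized_perturbation(x_diag, theta, G)
```

Since only the rank-one projections $U_i^n (U_i^n)^*$ enter the model, the sign convention has no effect on the spectrum; it is kept here only so that `U` coincides with the Gram-Schmidt output.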

Our goal will be to examine the large deviations for the m largest eigenvalues of the
deformed matrix $\widetilde{X}_n$, with m the number of positive eigenvalues of the random deformation.

2.2. The assumptions. Concerning the spectral measure of the full rank deterministic 
matrix Xn, we assume the following 

Assumption 2.1. The empirical distribution $\frac{1}{n} \sum_{i=1}^n \delta_{\lambda_i^n}$ of $X_n$ converges weakly as n
goes to infinity to a compactly supported probability measure $\mu$.

Concerning the random vector G, we make the following assumption. It allows us to claim
that with probability one, the column vectors $G_1^n, \dots, G_r^n$ are linearly independent and is
technically needed in the proof of Lemma 11.1. It is also the reason why we say that the
column vectors $G_1^n, \dots, G_r^n$ or $U_1^n, \dots, U_r^n$ are delocalized with respect to the eigenvectors
of $X_n$. Indeed, the eigenvectors of $X_n$ are the vectors of the canonical basis, whereas we
know that with probability one, none of the entries of the $G_i^n$'s (or of the $U_i^n$'s) is zero.
The i.i.d. feature of the G(k)'s even allows us to assert that all entries of each $G_i^n$ (or of
each $U_i^n$) have the same distribution.

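Assumption 2.2. $G = (g_1, \dots, g_r)$ is a random vector with entries in $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$ such
that there exists $\alpha > 0$ with $E\big(e^{\alpha \sum_{i=1}^r |g_i|^2}\big) < \infty$. In the orthonormalized perturbation
model, we assume moreover that for any $\lambda \in \mathbb{K}^r \setminus \{0\}$, $P\big(\sum_{i=1}^r \lambda_i g_i = 0\big) = 0$.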



The law of G could also depend on n provided it satisfies the above hypothesis uniformly
in n and converges in law as n goes to infinity.

We consider two distinct kinds of assumptions on the extreme eigenvalues of $X_n$. We
will be first interested in the case when these extreme eigenvalues stick to the bulk (see
Assumption 2.3), and then in the case with outliers, when we allow some eigenvalues
of $X_n$ to take their limit outside the support of the limiting measure $\mu$ (see Assumption 2.5).


2.3. The results in the case without outliers. We first consider the case where the 
extreme eigenvalues of Xn stick to the bulk. 

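Assumption 2.3. The largest and smallest eigenvalues of $X_n$ tend respectively to the
upper bound (denoted by b) and the lower bound (denoted by a) of the support of $\mu$.

Our main theorem is the following (see Theorem 6.1 and Theorem 6.4 for precise statements).

Theorem 2.4. Under Assumptions 2.1, 2.2 and 2.3, the law of the m largest eigenvalues
$(\widetilde{\lambda}_1^n, \dots, \widetilde{\lambda}_m^n) \in \mathbb{R}^m$ of $\widetilde{X}_n$ satisfies a large deviation principle (LDP) in the scale n with a
good rate function L. In other words, for any $K \in \mathbb{R}^+$, $\{L \le K\}$ is a compact subset of
$\mathbb{R}^m$, for any closed set F of $\mathbb{R}^m$,

$$\limsup_{n \to \infty} \frac{1}{n} \log P\big((\widetilde{\lambda}_1^n, \dots, \widetilde{\lambda}_m^n) \in F\big) \le -\inf_F L,$$

and for any open set O of $\mathbb{R}^m$,

$$\liminf_{n \to \infty} \frac{1}{n} \log P\big((\widetilde{\lambda}_1^n, \dots, \widetilde{\lambda}_m^n) \in O\big) \ge -\inf_O L.$$

Moreover, this rate function achieves its minimum value at a unique m-tuple $(\lambda_1^*, \dots, \lambda_m^*)$
towards which $(\widetilde{\lambda}_1^n, \dots, \widetilde{\lambda}_m^n)$ converges almost surely.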

Theorem 2.4 is true for both the i.i.d. perturbation model and the orthonormalized
perturbation model, but the exact expression of the rate function L is not the same for
both models. As could be expected, the minimum $(\lambda_1^*, \dots, \lambda_m^*)$ only depends on the $\theta_i$'s,
on the limiting spectral distribution $\mu$ of $X_n$, and on the covariance matrix of the vector
G, this latter dependence coming from the fact that the rate function involves a Laplace
transform of the law of G and its behavior near the extremum will generically be governed
by the second derivatives, that is the covariance.

The rate function L is not explicit in general. However, in the particular case where
$X_n = 0$, L can be evaluated. It amounts to considering the large deviations of the eigenvalues
of matrices $W_n = \frac{1}{n} G_n^* \Theta G_n$ for $G_n$ an n x r matrix, with r fixed and n growing to infinity.
L is very explicit when G is Gaussian but even when the entries are not Gaussian, we can
recover a large deviation principle and refine a bound of [23] about the deviations of the
largest eigenvalue (see Section [7]).
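To make the case $X_n = 0$ concrete, the sketch below checks numerically that the non-zero eigenvalues of the n x n deformation coincide with those of an r x r matrix built from the empirical covariance of the perturbation vectors. The convention that the matrix `G` has n rows and r columns (so that the deformation reads $\frac{1}{n} G \Theta G^T$), the Gaussian entries and the values of $\theta$ are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 300, 3
theta = np.array([2.0, 0.5, -1.0])           # assumed values of the theta_i's
G = rng.standard_normal((n, r))              # assumed Gaussian entries, n rows, r columns

# n x n deformation with X_n = 0: (1/n) * sum_i theta_i G_i G_i^T
W_big = (G * theta) @ G.T / n

# r x r empirical covariance of the perturbation vectors
C = G.T @ G / n

# AB and BA share their non-zero eigenvalues, so the non-zero spectrum of
# W_big equals the spectrum of the r x r matrix Theta @ C (real, since the
# latter is similar to the symmetric matrix C^{1/2} Theta C^{1/2}).
small_spectrum = np.sort(np.linalg.eigvals(np.diag(theta) @ C).real)
big_spectrum = np.linalg.eigvalsh(W_big)
largest_in_modulus = np.argsort(np.abs(big_spectrum))[-r:]
nonzero_big_spectrum = np.sort(big_spectrum[largest_in_modulus])
```

This reduction to a matrix of fixed size r is what makes the case $X_n = 0$ more tractable than the general one.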

2.4. The results in the case with outliers. We now consider the case where some
eigenvalues of $X_n$ escape from the bulk, so that Assumption 2.3 is not fulfilled. We
assume that these eigenvalues, that we call outliers, converge:

Assumption 2.5. There exist some non negative integers $p^+, p^-$ such that for any $i \le p^+$,
$\lambda_i^n \to \ell_i^+$, for any $j \le p^-$, $\lambda_{n-j+1}^n \to \ell_j^-$, $\lambda_{p^+ + 1}^n \to b$ and $\lambda_{n - p^-}^n \to a$ with

$$-\infty < \ell_1^- \le \dots \le \ell_{p^-}^- < a \le b < \ell_{p^+}^+ \le \dots \le \ell_1^+ < \infty,$$

where a and b denote respectively
the lower and upper bounds of the support of the limiting measure $\mu$.

To simplify the notations in the sequel we will use the following conventions: $\ell^-_{p^- + 1} := a$
and $\ell^+_{p^+ + 1} := b$.

In this framework, we will need to make on G the additional following assumption. 



Assumption 2.6. The law of the vector $G/\sqrt{n}$ satisfies a large deviation principle in the
scale n with a good rate function that we denote by I.

Theorem 2.7. If Assumptions 2.1, 2.2, 2.5 and 2.6 hold, the law of the $m + p^+$ largest
eigenvalues of $\widetilde{X}_n$ satisfies a large deviation principle with a good rate function $L^o$.

Again, Theorem 2.7 is true for both the i.i.d. perturbation model and the orthonormalized
perturbation model, but the rate function is not the same for both models. A precise
definition of $L^o$ will be given in Theorem 9.1.

Before going any further, let us discuss Assumption 2.6. On one side, let us give some
natural examples for which the assumption is fulfilled.

Lemma 2.8. (1) If $G = (g_1, \dots, g_r)$ are i.i.d. standard Gaussian variables, Assumption 2.6
holds with $I(v) = \frac{1}{2} \|v\|_2^2$.
(2) If G is such that for any $\alpha > 0$, $E\big[e^{\alpha \sum_{i=1}^r |g_i|^2}\big] < \infty$, then Assumption 2.6 holds
with I infinite except at 0, where it takes value 0.



Proof. The first result can be seen as a direct consequence of Schilder's theorem. For
the second, it is enough to notice by Tchebychev's inequality that for all $L, \delta > 0$,

$$P\Big(\max_{1 \le i \le r} |g_i|^2 > \delta n\Big) \le r\, e^{-L \delta n}\, E\big(e^{L \sum_{i=1}^r |g_i|^2}\big)$$

so that taking the large n limit and then L going to infinity yields for any $\delta > 0$

$$\limsup_{n \to \infty} \frac{1}{n} \log P\Big(\max_{1 \le i \le r} |g_i/\sqrt{n}|^2 > \delta\Big) = -\infty$$

thus proving the claim. □
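For readers who want the intermediate step of the second point spelled out, the following is one way to write it, using only the exponential Markov (Tchebychev) inequality and the moment hypothesis of Lemma 2.8 (2). The cruder bound without the factor r is used here, which is an equivalent choice for the purpose of the limit.

```latex
% For fixed L > 0 and delta > 0, since max_i |g_i|^2 <= sum_i |g_i|^2:
\begin{align*}
P\Big(\max_{1\le i\le r}|g_i|^2 > \delta n\Big)
  &\le P\Big(\sum_{i=1}^r |g_i|^2 > \delta n\Big)
   \le e^{-L\delta n}\, E\Big(e^{L\sum_{i=1}^r |g_i|^2}\Big), \\
\frac{1}{n}\log P\Big(\max_{1\le i\le r}|g_i|^2 > \delta n\Big)
  &\le -L\delta + \frac{1}{n}\log E\Big(e^{L\sum_{i=1}^r |g_i|^2}\Big).
\end{align*}
% The expectation is finite for every L and does not depend on n, so the
% last term vanishes as n -> infinity; the limsup is at most -L*delta,
% and letting L -> infinity gives -infinity.
```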



On the other side, we want to emphasize that in the case with outliers, the individual
LDP stated in Assumption 2.6 will be crucial. To understand this phenomenon more deeply,
we refer the interested reader to some counterexamples when this assumption is not
fulfilled that are studied in [30, Section 2.3] and a related discussion in the introduction
of [21].

2.5. Large deviations for the largest eigenvalues of perturbed matrix models.

We apply hereafter the results above to study the large deviations of the law of the extreme
eigenvalues of perturbations of randomly chosen matrices $X_n$ distributed according
to the Gibbs measure

$$d\mu_\beta^n(X) = \frac{1}{Z_\beta^n}\, e^{-n \operatorname{tr}(V(X))}\, d^\beta X$$

with $d^\beta X$ the Lebesgue measure on the set of n x n Hermitian matrices if $\beta = 2$ (corresponding
to G $\mathbb{C}$-valued) or n x n symmetric matrices if $\beta = 1$ (corresponding to G
$\mathbb{R}$-valued).

Let us first recall a few facts about the non-perturbed model. It is well known that if
$X_n$ is distributed according to $\mu_\beta^n$, the law of the eigenvalues of $X_n$ is given by

$$P^n_{V,\beta}(d\lambda_1, \dots, d\lambda_n) = \frac{1}{Z^n_{V,\beta}} \prod_{1 \le i < j \le n} |\lambda_i - \lambda_j|^\beta\, e^{-n \sum_{i=1}^n V(\lambda_i)} \prod_{i=1}^n d\lambda_i.$$

We will make on the potential V the following assumptions:

Assumption 2.9. i) V is continuous with values in $\mathbb{R} \cup \{+\infty\}$ and

$$\liminf_{|x| \to \infty} \frac{V(x)}{\beta \log |x|} > 1.$$

ii) For all integer numbers p, the limit

$$\lim_{n \to \infty} \frac{1}{n} \log \frac{Z^{n-p}_{nV/(n-p),\beta}}{Z^n_{V,\beta}}$$

exists and is denoted by $\alpha^p_{V,\beta}$.
iii) Under $P^n_{V,\beta}$, the largest eigenvalue $\lambda_1^n$ converges almost surely to the upper boundary
$b_V$ of the support of $\mu_V$.

Under part i) of the assumption, one can get a large deviation principle in the scale $n^2$
for the law of the spectral measure $\frac{1}{n} \sum_{i=1}^n \delta_{\lambda_i}$ under $P^n_{V,\beta}$ (see [S]), resulting in particular
in the almost sure convergence of the spectral measure to a probability measure $\mu_V$. If
we add parts ii) and iii), one can derive the large deviations for the extreme eigenvalues of
$X_n$ (see [8], and also [1, Section 2.6.2])^5. We give below a slightly more general statement
to consider the deviations of the pth largest eigenvalues (note that the pth smallest can
be considered similarly).

One can notice that these assumptions hold in a wide generality. In particular, they
are satisfied for the law of the GUE ($\beta = 2$, $V(x) = x^2$) and the GOE ($\beta = 1$,
$V(x) = x^2/2$) as part ii) is verified by Selberg formula whereas part iii) is well known
(see [1, Section 2.1.6]). For the case of Gaussian Wishart matrices, we know (see e.g.
[H, p. 190]) that the joint law of the eigenvalues can be written as $P^n_{V_{p,n},\beta}$, where the
potential $V_{p,n}$ on $(0, \infty)$ is the sum of a linear term in x and of a multiple of $\log x$
depending on $\beta$ and on the dimensions. If the ratio of the dimensions converges to $\alpha$, one can easily show
that the laws of the largest eigenvalues are exponentially equivalent under $P^n_{V_{p,n},\beta}$ and under
$P^n_{V,\beta}$, with $V(x) = \frac{\beta}{2} x - \beta(1-\alpha) \log x$ on $(0, \infty)$, for which the assumptions are satisfied.

^5 Note that in the published version of [1], part iii) was not mentioned but it appears in the errata
sheet available online: http://www.wisdom.weizmann.ac.il/~zeitouni/cormat.pdf

Theorem 2.10. Under Assumption 2.9, the law of the p largest eigenvalues $(\lambda_1^n \ge \dots \ge
\lambda_p^n)$ of $X_n$ satisfies a large deviation principle in the scale n and with good rate function
given by

$$J^p(x_1, \dots, x_p) = \begin{cases} \sum_{i=1}^p J_V(x_i) + p\, \alpha^1_{V,\beta}, & \text{if } x_1 \ge x_2 \ge \dots \ge x_p, \\ \infty, & \text{otherwise,} \end{cases}$$

with $J_V(x) = V(x) - \beta \int \log |x - y|\, d\mu_V(y)$.

Remark 2.11. Note that in the case of the GOE and the GUE (see [8]),
Jy{x) = (3 v/(y/2)2 - Idy - a\p, = -/3/2. 

Let us now go to the perturbed model. An important remark is that, due to the
rotational invariance of the law of $X_n$, one can in fact consider very general orthonormal
perturbations. We make the following

Assumption 2.12. $(U_1^n, \dots, U_r^n)$ is a family of orthonormal vectors in $(\mathbb{R}^n)^r$ (resp.
$(\mathbb{C}^n)^r$) if $\beta = 1$ (resp. $\beta = 2$), either deterministic or independent of $X_n$.

Indeed, under these assumptions, $\widetilde{X}_n$ has in law the same eigenvalues as
$D_n + \sum_{i=1}^r \theta_i (O_n U_i^n)(O_n U_i^n)^*$, with $D_n$ a real diagonal matrix with $P^n_{V,\beta}$-distributed eigenvalues
and $O_n$ Haar distributed on the orthogonal (resp. unitary) group of size n if $\beta = 1$
(resp. $\beta = 2$), independent of $\{D_n\} \cup \{U_1^n, \dots, U_r^n\}$. Now, from the well known properties
of the Haar measure, if the $U_i^n$'s satisfy Assumption 2.12, then the $O_n U_i^n$'s are
column vectors of a Haar distributed matrix. In particular they can be obtained by the
orthonormalization procedure described in the introduction, with $G = (g_1, \dots, g_r)$ a vector
whose components are i.i.d. Gaussian standard variables (which satisfies in particular
Assumption 2.6).

With these considerations in mind, we can state the large deviation principle for the
extreme eigenvalues of $\widetilde{X}_n$. We recall that $b_V$ is the rightmost point of the support of $\mu_V$.

Theorem 2.13. With V satisfying Assumption 2.9, we consider the orthonormalized
perturbation model under Assumption 2.12. Then, for any integer k, the law of the k
largest eigenvalues $(\widetilde{\lambda}_1^n, \dots, \widetilde{\lambda}_k^n)$ of $\widetilde{X}_n$ satisfies a large deviation principle in the scale n
and with good rate function given by

$$\widetilde{J}^k(x_1, \dots, x_k) = \inf_{p \ge 0}\ \inf_{\ell_1 \ge \dots \ge \ell_p \ge b_V} \big\{ L^o_{\ell_1, \dots, \ell_p}(x_1, \dots, x_k) + J^p(\ell_1, \dots, \ell_p) \big\},$$

if $x_1 \ge \dots \ge x_k$, the function being infinite otherwise.

Here, $L^o_{\ell_1, \dots, \ell_p}$ is the rate function defined in Theorem 9.1 for the orthonormalized perturbation
model built on $G = (g_1, \dots, g_r)$ i.i.d. standard Gaussian variables and $X_n$ with
limiting spectral measure $\mu_V$ and outliers $\ell_1, \dots, \ell_p$.

3. Scheme of the proofs



The strategy of the proof will be quite similar in both cases (with or without outliers), 
so, for the sake of simplicity, we will outline it in the present section only in the case 
without outliers (both the i.i.d. perturbation model and the orthonormalized perturbation 
model will be treated simultaneously). 

The cornerstone is a nice representation, already crucially used in many papers on
finite rank deformations (see e.g. [HI], [3]), of the eigenvalues $(\widetilde{\lambda}_1^n, \dots, \widetilde{\lambda}_m^n)$ as zeroes of a
fixed deterministic polynomial in the entries of matrices of size r depending only on the
resolvent of $X_n$ and the random vectors $(G_i^n)_{1 \le i \le r}$.

Indeed, if V is the n x r matrix with column vectors $[U_1^n \cdots U_r^n]$ in the orthonormalized
perturbation model and $n^{-1/2}[G_1^n \cdots G_r^n]$ in the i.i.d. perturbation model, $\Theta$ the matrix
$\operatorname{diag}(\theta_1, \dots, \theta_r)$ and $I_n$ the identity in n x n matrices, the characteristic polynomial of $\widetilde{X}_n$
reads

$$\det(zI_n - \widetilde{X}_n) = \det(zI_n - X_n - V \Theta V^*) = \det(zI_n - X_n)\, \det(I_r - V^*(zI_n - X_n)^{-1} V \Theta) \qquad (3)$$

It means that the eigenvalues of $\widetilde{X}_n$ that are not eigenvalues of $X_n$ are the zeroes
of $\det(I_r - V^*(zI_n - X_n)^{-1} V \Theta)$, which is the determinant of a matrix whose size is
independent of n.
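The determinant identity (3) is easy to check numerically. The sketch below does so in the real symmetric case for one value of z outside both spectra; the sizes, the value of z and the random inputs are arbitrary assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 30, 3
x_diag = rng.uniform(-1.0, 1.0, size=n)          # assumed eigenvalues of X_n
theta = np.array([2.0, 1.0, -1.5])
V = rng.standard_normal((n, r)) / np.sqrt(n)      # i.i.d. model: V = n^{-1/2} [G_1 ... G_r]

X = np.diag(x_diag)
Theta = np.diag(theta)
X_tilde = X + V @ Theta @ V.T

z = 5.0                                           # a point outside both spectra
resolvent = np.diag(1.0 / (z - x_diag))           # (z I_n - X_n)^{-1}

lhs = np.linalg.det(z * np.eye(n) - X_tilde)
rhs = np.linalg.det(z * np.eye(n) - X) * np.linalg.det(
    np.eye(r) - V.T @ resolvent @ V @ Theta
)
```

The second factor of `rhs` is an r x r determinant, which is the whole point: the size of the matrix to analyse no longer grows with n.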

Because of the relation between V and the random vectors $G_1^n, \dots, G_r^n$, it is not hard
to check that, if we let, for $z \notin \{\lambda_1^n, \dots, \lambda_n^n\}$, $K^n(z)$ and $C^n$ be the elements of the set $H_r$
of r x r Hermitian matrices given, for $1 \le i \le j \le r$, by

$$K^n(z)_{ij} = \frac{1}{n} \sum_{k=1}^n \frac{\overline{g_i(k)}\, g_j(k)}{z - \lambda_k^n} \qquad (4)$$

and

$$C^n_{ij} = \frac{1}{n} \sum_{k=1}^n \overline{g_i(k)}\, g_j(k), \qquad (5)$$

we have (see Section 4 for details):

Proposition 3.1. In both i.i.d. and orthonormalized perturbation models, there exists a
function $P_{\Theta,r}$ defined on $H_r \times H_r$ which is polynomial in the entries of its arguments and
depends only on the matrix $\Theta$, such that any $z \notin \{\lambda_1^n, \dots, \lambda_n^n\}$ is an eigenvalue of $\widetilde{X}_n$ if
and only if

$$H^n(z) := P_{\Theta,r}(K^n(z), C^n) = 0.$$
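In the i.i.d. perturbation model the function of Proposition 3.1 can be taken to be $\det(\Theta^{-1} - K^n(z))$, as shown in Section 4.2. The following sketch evaluates it at the largest eigenvalue of the deformed matrix and at a nearby point; the helper names `K_n` and `H_n_iid`, the Gaussian entries and the numerical parameters are assumptions of the example.

```python
import numpy as np

def K_n(z, x_diag, G):
    """r x r matrix (1/n) G^* (z I_n - X_n)^{-1} G for a diagonal X_n, as in (4)."""
    n = len(x_diag)
    return G.conj().T @ (G / (z - x_diag)[:, None]) / n

def H_n_iid(z, x_diag, G, theta):
    """det(Theta^{-1} - K^n(z)): vanishes exactly at eigenvalues outside spec(X_n)."""
    return np.linalg.det(np.diag(1.0 / theta) - K_n(z, x_diag, G))

rng = np.random.default_rng(3)
n = 100
x_diag = rng.uniform(-1.0, 1.0, size=n)       # assumed spectrum of X_n
theta = np.array([3.0, -2.0])                 # one positive, one negative value
G = rng.standard_normal((n, 2))

X_tilde = np.diag(x_diag) + (G * theta) @ G.T / n
lam_max = np.linalg.eigvalsh(X_tilde)[-1]     # largest eigenvalue of the deformed matrix

value_at_eigenvalue = H_n_iid(lam_max, x_diag, G, theta)
value_elsewhere = H_n_iid(lam_max + 1.0, x_diag, G, theta)
```

With these parameters the largest eigenvalue separates from the spectrum of $X_n$, so it is a genuine zero of the r x r determinant while a shifted point is not.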

(We show in Section 11.2 that the spectra of $X_n$ and $\widetilde{X}_n$ are disjoint in generic situations.)

Of course, the polynomial $P_{\Theta,r}$ is different in the i.i.d. perturbation model and the
orthonormalized perturbation model. In the i.i.d. perturbation model, $P_{\Theta,r}$ is simpler
and does not depend on C. This proposition characterizes the eigenvalues of $\widetilde{X}_n$ as the
zeroes of the random function $H^n$, which depends continuously (as a polynomial function)
on the random pair $(K^n(\cdot), C^n)$. The large deviations of these eigenvalues are therefore
inherited from the large deviations of $(K^n(\cdot), C^n)$, which we thus study in detail before
getting into the deviations of the eigenvalues themselves. Because $K^n(z)$ blows up when
z approaches $\lambda_1^n$, which itself converges to b, we study the large deviations of $(K^n(z), C^n)$
for z away from b. We shall let $\mathcal{K}$ be a compact interval in $(b, \infty)$, $C(\mathcal{K}, H_r)$ and $C(\mathcal{K}, \mathbb{R})$
be the space of continuous functions on $\mathcal{K}$ taking values respectively in $H_r$ and in $\mathbb{R}$. We
endow the latter set with the uniform topology. We will then prove that (see Theorem
5.1 for a precise statement and a definition of the rate function I involved)

Proposition 3.2. The law of $((K^n(z))_{z \in \mathcal{K}}, C^n)$ on $C(\mathcal{K}, H_r) \times H_r$ equipped with the uniform
topology, satisfies a large deviation principle in the scale n and with good rate function I.

By the contraction principle, we therefore deduce

Corollary 3.3. The law of $(H^n(z))_{z \in \mathcal{K}}$ on $C(\mathcal{K}, \mathbb{R})$ equipped with the uniform topology,
satisfies a large deviation principle in the scale n and with rate function given, for a
continuous function $f \in C(\mathcal{K}, \mathbb{R})$, by

$$J_{\mathcal{K}}(f) = \inf\big\{ I(K(\cdot), C)\ ;\ (K(\cdot), C) \in C(\mathcal{K}, H_r) \times H_r,\ P_{\Theta,r}(K(z), C) = f(z)\ \ \forall z \in \mathcal{K} \big\}$$

with $P_{\Theta,r}$ the polynomial function of Proposition 3.1.

Theorem 2.4 is then a consequence of this corollary with, heuristically, $L(\alpha)$ the infimum
of $J_{[b,+\infty)}$ on the set of functions which vanish exactly at $\alpha \in \mathbb{R}^m$. An important
technical issue will come from the fact that the set of functions which vanish exactly at
$\alpha$ has an empty interior, which requires extra care for the large deviation lower bound.

The organisation of the paper will follow the scheme we have just described: in the next
section, we detail the orthonormalization procedure and prove Proposition 3.1. Section 5
and Section 6 will then deal more specifically with the case without outliers. In Section 5
we establish the functional large deviation principles for $(K^n(\cdot), C^n)$ and $H^n$, whereas
Section 6 is devoted to the proof of our main results in this case, namely the large deviation
principle for the largest eigenvalues of $\widetilde{X}_n$ and the almost sure convergence to the
minimisers of the rate function. In Section 7 we will see that the rate function can be
studied further in the special case when $X_n = 0$. We then turn to the case with outliers
in Sections 8 and 9. Therein, the proofs will be less detailed, but we will insist on the
points that differ from the previous case. The extension to random matrices $X_n$ given by
classical matrix models is presented in Section 10. To make the core of the paper easier
to read, we gather some technical results in Section 11.

4. Characterisation of the eigenvalues of $\widetilde{X}_n$ as zeroes of a function $H^n$

The goal of this section is to prove Proposition 3.1. As will be seen further, the proof
of this proposition is straightforward in the i.i.d. perturbation model but more involved
in the orthonormalized perturbation model and we first detail the orthonormalization
procedure.

4.1. The Gram-Schmidt orthonormalisation procedure. 

We start by detailing the construction of $(U_i^n)_{1 \le i \le r}$ from $(G_i^n)_{1 \le i \le r}$ in the orthonormalized
perturbation model. The canonical scalar product in $\mathbb{C}^n$ will be denoted by
$\langle v, w \rangle = v^* w = \sum_{k=1}^n \overline{v_k}\, w_k$, and the associated norm by $\| \cdot \|_2$. We also recall that $H_r$ is
the space of r x r either symmetric or Hermitian matrices, according to whether G is a
real ($\mathbb{K} = \mathbb{R}$) or complex ($\mathbb{K} = \mathbb{C}$) random vector.

Fix $1 \le r \le n$ and consider a linearly independent family $G_1, \dots, G_r$ of vectors in $\mathbb{K}^n$.
Define their Gram matrix (up to a factor n)

$$C = [C_{ij}]_{i,j=1}^r, \quad \text{with } C_{ij} = \frac{1}{n} \langle G_i, G_j \rangle.$$

We then define

$$q_1 = 1 \quad \text{and for } i = 2, \dots, r, \quad q_i := \det[C_{kl}]_{k,l=1}^{i-1} \qquad (6)$$

and the lower triangular matrix $A = [A_{ij}]_{1 \le j \le i \le r}$ as follows: $A_{ii} = 1$ and, for all $1 \le j < i \le r$,

$$A_{ij} = \frac{\det[\gamma^{i,j}_{kl}]_{k,l=1}^{i-1}}{q_i}, \quad \text{with } \gamma^{i,j}_{kl} = \begin{cases} C_{kl} & \text{if } l \ne j, \\ -C_{ki} & \text{if } l = j. \end{cases} \qquad (7)$$

Note that by linear independence of the $G_i$'s, none of the $q_i$'s is zero so that the matrix
A is well defined.

Then the vectors $W_1, \dots, W_r$ defined, for $i = 1, \dots, r$, by

$$W_i = \frac{1}{\sqrt{n}} \sum_{j=1}^{i} A_{ij}\, G_j$$

are orthogonal and the $U_i$'s, defined, for $i = 1, \dots, r$, by

$$U_i = \frac{W_i}{\|W_i\|_2}$$

are orthonormal. They are said to be the Gram-Schmidt orthonormalized vectors from
$(G_1, \dots, G_r)$. The following property, which can be easily deduced from the definitions
we have just introduced, will be useful in the sequel.

Property 4.1. For each $i_0 = 1, \dots, r$, there is a real function $P_{i_0}$, defined on $H_r$, polynomial
in the entries of the matrix, depending neither on n nor on the $G_i$'s, such that

$$\|q_{i_0} W_{i_0}\|_2^2 = P_{i_0}(C).$$

Moreover, the polynomial function $P_{i_0}$ is positive on the set of positive definite matrices.

The last assertion of the property comes from the fact that any positive definite r x r
Hermitian matrix is the Gram matrix of a linearly independent family of r vectors of $\mathbb{K}^r$
(namely the columns of its square root).
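The Gram-Schmidt procedure can be driven entirely by the Gram matrix C, which is what makes the polynomial dependence of Property 4.1 plausible. The sketch below is one standard way to do this, via Cramer's rule; the exact form of the coefficients, the normalisation by $\sqrt{n}$ and the helper name `gram_schmidt_via_gram` are assumptions of this example and are not claimed to reproduce the authors' formula verbatim. It is restricted to real vectors.

```python
import numpy as np

def gram_schmidt_via_gram(G):
    """Gram-Schmidt on the columns of G, with coefficients computed from C = G^T G / n."""
    n, r = G.shape
    C = G.T @ G / n                          # Gram matrix up to the factor n
    W = np.zeros((n, r))
    for i in range(r):
        coeffs = np.zeros(r)
        coeffs[i] = 1.0                      # coefficient of G_i is normalised to 1
        if i > 0:
            q_i = np.linalg.det(C[:i, :i])   # leading principal minor of C
            for j in range(i):
                # Cramer's rule: replace column j by the right-hand side -C[:i, i]
                M = C[:i, :i].copy()
                M[:, j] = -C[:i, i]
                coeffs[j] = np.linalg.det(M) / q_i
        W[:, i] = G @ coeffs / np.sqrt(n)    # orthogonal to G_1, ..., G_{i-1}
    U = W / np.linalg.norm(W, axis=0)        # normalise each column
    return C, W, U

rng = np.random.default_rng(5)
G = rng.standard_normal((50, 3))
C, W, U = gram_schmidt_via_gram(G)

# Reference: QR factorisation with signs fixed to match Gram-Schmidt
Q, R = np.linalg.qr(G)
U_reference = Q * np.sign(np.diag(R))
```

Multiplying the coefficients of $W_i$ by the minor $q_i$ removes all denominators, so the squared norm of $q_i W_i$ is a polynomial in the entries of C.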



Let now G be a random vector satisfying Assumption 2.2 and $(G(k), k \ge 1)$ be i.i.d.
copies of G. Let $G_i^n = (G(k)_i)_{1 \le k \le n}$ for $i \in \{1, \dots, r\}$. One can easily check that if n > r,
these vectors are almost surely linearly independent, so that we can apply Gram-Schmidt
orthonormalisation to this family of random vectors. We define the r x r matrices $C^n$, $A^n$,
the real numbers $q_i^n$ and the vectors $W_1^n, \dots, W_r^n$, $U_1^n, \dots, U_r^n$ of $\mathbb{K}^n$ as above. As announced
in Section 2, these $U_i^n$'s are the Gram-Schmidt orthonormalized vectors of the $G_i^n$'s we
used to define our model in the introduction.

4.2. Characterization of the eigenvalues of $\widetilde{X}_n$: proof of Proposition 3.1.



As explained in Section 3, a crucial observation (see [HI, Proposition 5.1]) is that the
eigenvalues of $\widetilde{X}_n$ can be characterized as the zeroes of a polynomial function of matrices
of size r x r. This was stated in Proposition 3.1, which we prove below.

Proof of Proposition 3.1. We first recall (3), that is, for $z \notin \{\lambda_1^n, \dots, \lambda_n^n\}$,

$$\det(zI_n - \widetilde{X}_n) = \det(zI_n - X_n)\, \det(\Theta)\, \det(\Theta^{-1} - V^*(zI_n - X_n)^{-1} V).$$

Hence any $z \notin \{\lambda_1^n, \dots, \lambda_n^n\}$ is an eigenvalue of $\widetilde{X}_n$ if and only if

$$D_n(z) := \det(\Theta^{-1} - V^*(zI_n - X_n)^{-1} V) = 0.$$

We denote by G the n x r matrix with column vectors $(G_i^n)_{1 \le i \le r}$, so that
$K^n(z) = \frac{1}{n} G^*(zI_n - X_n)^{-1} G$.

In the i.i.d. perturbation model, as $V = n^{-1/2} G$, Proposition 3.1 follows immediately with

$$H^n(z) := \det(\Theta^{-1} - V^*(zI_n - X_n)^{-1} V),$$

which is actually a polynomial, depending on $\Theta$, in the entries of $K^n(z)$.

In the orthonormalized perturbation model, the Gram-Schmidt procedure makes things
a bit more involved.

If we denote by D the r x r diagonal matrix given by $D = \operatorname{diag}(\|W_1^n\|_2, \dots, \|W_r^n\|_2)$
and $\Sigma = (A^n)^T$, then V is equal to $n^{-1/2} G \Sigma D^{-1}$ and we deduce that

$$D_n(z) = \det(\Theta^{-1} - D^{-1} \Sigma^* K^n(z) \Sigma D^{-1}).$$

Now, if we define $Q = \operatorname{diag}(q_1^n, \dots, q_r^n)$ (recall (6)), $E = DQ$, $F = \Sigma Q$ and $H^n(z) :=
\det(E^* \Theta^{-1} E - F^* K^n(z) F)$ then on one hand, one can check that

$$D_n(z) = (\det E^* E)^{-1} H^n(z),$$

so that any $z \notin \{\lambda_1^n, \dots, \lambda_n^n\}$ is an eigenvalue of $\widetilde{X}_n$ if and only if it is a zero of $H^n$.
On the other hand, $H^n(z)$ is obviously a polynomial (depending only on the matrix $\Theta$)
of the entries of $K^n(z)$, $E^* \Theta^{-1} E$ and F. Furthermore, $E^* \Theta^{-1} E$ is a diagonal matrix
whose i-th entry is given by $(E^* \Theta^{-1} E)_{ii} = \theta_i^{-1} \|q_i^n W_i^n\|_2^2 = \theta_i^{-1} P_i(C^n)$ (by Property 4.1)
and, for $i < j$, $F_{ij} = \det[\gamma^{j,i}_{kl}]_{k,l=1}^{j-1}$ with $\gamma$ defined in (7). This concludes the proof. □

5. Large deviations for $H^n$ in the case without outliers
We assume throughout this section that Assumptions 2.1, 2.2 and 2.3 hold.

5.1. Statement of the result. 

In the sequel, $\mathcal{K}$ will denote any compact interval included in $(b, \infty)$, and we denote by
$z^*$ its upper bound. We equip $C(\mathcal{K}, H_r) \times H_r$ with the uniform topology which is given by
the distance d defined, for $(K_1, C_1), (K_2, C_2) \in C(\mathcal{K}, H_r) \times H_r$, by

$$d((K_1, C_1), (K_2, C_2)) = \sup_{z \in \mathcal{K}} \|K_1(z) - K_2(z)\|_2 + \|C_1 - C_2\|_2,$$

where $\|M\|_2 = \sqrt{\operatorname{Tr}(M^2)}$ for all $M \in H_r$.

With $G = (g_1, \dots, g_r)$ satisfying Assumption 2.2, we define Z a matrix in $H_r$ such that,
for $i \le j$, $Z_{ij} = \overline{g_i}\, g_j$ and $\Lambda$ given, for any $H \in H_r$ by

$$\Lambda(H) = \log E\big(e^{\operatorname{Tr}(HZ)}\big). \qquad (8)$$
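As an illustration of definition (8), when G is a real standard Gaussian vector one has $\operatorname{Tr}(HZ) = g^T H g$ and the classical Gaussian integral gives $\Lambda(H) = -\frac{1}{2} \log \det(I_r - 2H)$ whenever $I_r - 2H$ is positive definite. The sketch below compares this closed form with a Monte Carlo estimate; the Gaussian choice, the matrix `H`, the sample size and the function names are assumptions of the example.

```python
import numpy as np

def log_laplace_gaussian(H):
    """Closed form of log E exp(g^T H g) for g ~ N(0, I_r), real symmetric H."""
    r = H.shape[0]
    M = np.eye(r) - 2.0 * H
    if np.linalg.eigvalsh(M).min() <= 0:
        raise ValueError("I - 2H must be positive definite for a finite value")
    return -0.5 * np.linalg.slogdet(M)[1]

def log_laplace_monte_carlo(H, n_samples, rng):
    """Monte Carlo estimate of log E exp(Tr(H Z)) with Z = g g^T."""
    g = rng.standard_normal((n_samples, H.shape[0]))
    quad = np.einsum("ki,ij,kj->k", g, H, g)   # g_k^T H g_k for each sample k
    return np.log(np.mean(np.exp(quad)))

H = np.array([[0.10, 0.05],
              [0.05, -0.20]])
rng = np.random.default_rng(6)
closed_form = log_laplace_gaussian(H)
estimate = log_laplace_monte_carlo(H, 200_000, rng)
```

For non-Gaussian G no such closed form is available in general, and $\Lambda$ is only known to be finite on a neighbourhood of the origin under Assumption 2.2.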

The goal of this section is to show the following theorem. 

Theorem 5.1. (1) The law of $((K^n(z))_{z \in \mathcal{K}}, C^n)$, viewed as an element of the space 
$C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r$ equipped with the uniform topology, satisfies a large deviation 
principle in the scale $n$ and with good rate function $I$ which is infinite if $K$ is not 
Lipschitz continuous and otherwise defined, for $K \in C(\mathcal{K}, \mathsf{H}_r)$ and $C \in \mathsf{H}_r$, by 

$I(K(\cdot), C) = \sup_{P, X, Y} \left\{ \mathrm{Tr}\left( \int K'(z)P(z)\,dz + K(z^*)X + CY \right) - \Gamma(P, Y, X) \right\},$ 

where $\Gamma(P, Y, X)$ is given by the formula 

$\Gamma(P, Y, X) = \int \Lambda\left( -\int \frac{1}{(z-x)^2}P(z)\,dz + \frac{1}{z^* - x}X + Y \right) d\mu(x)$ 

and the supremum is taken over piecewise constant functions $P$ with values in $\mathsf{H}_r$ 
and $X, Y$ in $\mathsf{H}_r$. 

(2) The law of $(H^n(z))_{z \in \mathcal{K}}$ on $C(\mathcal{K}, \mathbb{R})$ equipped with the uniform topology satisfies a 
large deviation principle in the scale $n$ and with rate function given, for a continuous 
function $f \in C(\mathcal{K}, \mathbb{R})$, by 

$J_{\mathcal{K}}(f) = \inf\{ I(K(\cdot), C) \,;\, (K(\cdot), C) \in C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r,\ P_{\Theta,r}(K(z), C) = f(z)\ \forall z \in \mathcal{K} \},$ 

with $P_{\Theta,r}$ the polynomial function of Proposition 3.1. 

Since the map $(K(\cdot), C) \mapsto (P_{\Theta,r}(K(z), C))_{z \in \mathcal{K}}$ from $C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r$ to $C(\mathcal{K}, \mathbb{R})$, both 
equipped with their uniform topology, is continuous and $I$ is a good rate function, the 
second part of the theorem is a direct consequence of its first part and the contraction 
principle [16, Theorem 4.2.1]. 

The remainder of the section will be devoted to the proof of the first part of the theorem 
and the study of the properties of the rate function $I$, in particular its minimisers. 

5.2. Proof of Theorem 5.1. 

The strategy will be to establish a LDP for finite dimensional marginals of the process 
$((K^n(z))_{z \in \mathcal{K}}, C^n)$ based on [30, Theorem 2.2] (see also |S] and [12]). From that, we will 
establish a LDP in the topology of pointwise convergence via the Dawson-Gärtner theorem. 
As $((K^n(z))_{z \in \mathcal{K}}, C^n)$ will be shown to be exponentially tight for the uniform topology, the 
LDP will also hold in this latter topology. 
5.2.1. Exponential tightness. We start with the exponential tightness, stated in the 
following lemma. As $\mathcal{K}$ is a compact subset of $(b, \infty)$ and the largest eigenvalue $\lambda_1^n$ tends 
to $b$, there exists $0 < \varepsilon < 1$ (depending only on $\mathcal{K}$) such that for $n$ large enough, for any 
$z \in \mathcal{K}$ and $1 \le i \le n$, $z - \lambda_i^n > \varepsilon$. We fix hereafter such an $\varepsilon$. 

For any $L > 0$, we define 

$\mathcal{C}_{\mathcal{K},L} := \left\{ (K, C) \in C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r \,;\ \sup_{z \in \mathcal{K}} \|K(z)\|_2 + \|C\|_2 \le L,\ K \text{ is } L\text{-Lipschitz} \right\}.$ 

We have 
Lemma 5.2. 

$\limsup_{L \to \infty} \limsup_{n \to \infty} \frac{1}{n} \log P\left( ((K^n(z))_{z \in \mathcal{K}}, C^n) \in \mathcal{C}_{\mathcal{K},L}^c \right) = -\infty.$ 

In particular, the law of $((K^n(z))_{z \in \mathcal{K}}, C^n)$ is exponentially tight for the uniform topology 
on $C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r$. 



Proof. We claim that 



i<i<r 2r 
Indeed, for n large enough. 



max Q < ^ K {((/r (^)),e;c, C") e Cyc,L} • 



\K-(z),,-K-{z%\<^/C-C- 



z — z'\ 



whereas since |C-Jp < C^iCjj, ||C"||2 < rmaxi<j<rCj" and ||i^'"(z)||2 < maxi<j<r C,". 



Now, by Assumption 2.2, let $\alpha > 0$ be such that $C := E\big(e^{\alpha \sum_{i=1}^r |g_i|^2}\big) < \infty$. 

pfmaxQ>^^ < rpfcr,>^) (9) 



< rE (^e"^^ j e-n"|^ < ^c-^e"""^ < e"""^, (10) 

where the last inequality holds for n and L large enough. This gives 

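$\limsup_{L \to \infty} \limsup_{n \to \infty} \frac{1}{n} \log P\left( ((K^n(z))_{z \in \mathcal{K}}, C^n) \in \mathcal{C}_{\mathcal{K},L}^c \right) = -\infty.$ 

By the Arzelà-Ascoli theorem, $\mathcal{C}_{\mathcal{K},L}$ is a compact subset of $C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r$ for any $L > 0$, 
from which we get immediately the second part of the lemma. □ 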

5.2.2. Large deviation principle for finite dimensional marginals. We now study the finite 
dimensional marginals of our process. More precisely, we intend to show the following: 

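Proposition 5.3. Let $M$ be a positive integer and $b < z_1 < z_2 < \cdots < z_M$. 
The law of $((K^n(z_i))_{1 \le i \le M}, C^n)$, viewed as an element of $\mathsf{H}_r^{M+1}$, satisfies a large deviation 
principle in the scale $n$ with good rate function $I^{z_1,\ldots,z_M}$ defined, for $K_1, \ldots, K_M, C \in \mathsf{H}_r$, 
by 

$I^{z_1,\ldots,z_M}(K_1, \ldots, K_M, C) = \sup_{\Xi_1, \ldots, \Xi_M, Y \in \mathsf{H}_r} \left\{ \mathrm{Tr}\left( \sum_{i=1}^M \Xi_i K_i + YC \right) - \Gamma_M(\Xi_1, \ldots, \Xi_M, Y) \right\},$ 

with $\Gamma_M(\Xi_1, \ldots, \Xi_M, Y)$ defined by the formula 

$\Gamma_M(\Xi_1, \ldots, \Xi_M, Y) = \int \Lambda\left( \sum_{i=1}^M \frac{1}{z_i - x}\,\Xi_i + Y \right) d\mu(x),$ 

$\Lambda$ being given by (8). 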

Proof. The proof of the proposition is a direct consequence of Theorem 2.2 of [30]. 
Indeed, let $Z_1$ be the $\mathsf{H}_r$-valued random variable such that for all $1 \le i, j \le r$, 

$(Z_1)_{ij} = \overline{g_i(1)}\, g_j(1)$ 

and we define / the matrix- valued continuous function with values in M[(*^+^)'^']^'" such 
that, if we denote by Ir the identity matrix in H^, 



v 



Ir 



Ir 



J 



Now, if (2'fc)i<fc<„ are iid copies of Zi, we denote by 

/ K-{z,) \ 



fc=i 



V 



I 



A slight problem is that ^ do not fulfill Assumption A. 1 in [3D] in the sense that 

this assumption requires that for all i, belongs to the support of the limiting measure 
IX. Nevertheless, it is easy to construct (as was done in the proof of Theorem 3.2 in [29]) a 
sequence A" such that ^ Yll^=i "^a" fulfills Assumption A.l in [30] and Ln := ^ Ylk=i f(^k) 
is exponentially equivalent to Then from Theorem 2.2 of [30], we get that L„ satisfies 
an LDP in the scale n with good rate function I^^'"'^'^' . □ 



5.2.3. Large deviation principle for the law of $((K^n(z))_{z \in \mathcal{K}}, C^n)$. The next step is to 
establish a LDP for the law of $((K^n(z))_{z \in \mathcal{K}}, C^n)$ associated with the topology of pointwise 
convergence. The following proposition will be a straightforward application of the 
Dawson-Gärtner theorem on projective limits. 

Proposition 5.4. The law of $((K^n(z))_{z \in \mathcal{K}}, C^n)$ as an element of $C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r$ equipped 
with the topology of pointwise convergence satisfies a LDP in the scale $n$ with good rate 
function $J$ defined as follows: for $K \in C(\mathcal{K}, \mathsf{H}_r)$ and $C \in \mathsf{H}_r$, 

$J(K, C) = \sup_{M} \ \sup_{z_1 < \cdots < z_M,\ z_i \in \mathcal{K}} I^{z_1,\ldots,z_M}(K(z_1), \ldots, K(z_M), C).$ 

Moreover $J$ equals the rate function $I$ given in Theorem 5.1 (1). 

Proof. Let $\mathcal{J}$ be the collection of all finite subsets of $\mathcal{K}$ ordered by inclusion. For $j = 
\{z_1, \ldots, z_{|j|}\} \in \mathcal{J}$ and $f$ a measurable function from $\mathcal{K}$ to $\mathsf{H}_r$, we set $p_j(f) = (f(z_1), \ldots, f(z_{|j|})) \in \mathsf{H}_r^{|j|}$. 

We know from Proposition 5.3 that the law of $(p_j(K^n), C^n)$ satisfies a LDP with good 
rate function $I^{z_1,\ldots,z_{|j|}}$. Moreover, one can check that the projective limit of the family 
$\mathsf{H}_r^{|j|} \times \mathsf{H}_r$ is $\mathsf{H}_r^{\mathcal{K}} \times \mathsf{H}_r$ equipped with the topology of pointwise convergence. 
Therefore, the Dawson-Gärtner theorem [16, Theorem 4.6.1] proves the LDP with rate 
function $J$. The identification of $J$ as $I$ is straightforward as, by a simple change of 
variables, $J$ is the supremum of 

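$\mathrm{Tr}\left( \sum_{i=1}^{M-1} \Xi(z_i)\,(K(z_{i+1}) - K(z_i)) + \Xi(z_M)K(z_M) + CY \right)$ 

$\quad - \int \Lambda\left( \sum_{i=1}^{M-1} \Xi(z_i) \left( \frac{1}{z_{i+1} - x} - \frac{1}{z_i - x} \right) + \frac{\Xi(z_M)}{z_M - x} + Y \right) d\mu(x)$ 

over the choices of $\Xi$, $M$, $z$. We may assume without loss of generality that $z_M = z^*$. 
Putting $P(z) = \sum_{i=1}^{M-1} \Xi(z_i) 1_{[z_i, z_{i+1}]}(z)$ and $X = \Xi(z_M)$, we identify $J$ and $I$. Thus the proof 
of the proposition is complete. □ 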

To complete the proof of Theorem 5.1 (1), we now need to show that the LDP is also 
true for the uniform topology. From Proposition 5.4 and Lemma 5.2, and as the topology 
of uniform convergence is finer than the topology of pointwise convergence, we can 
apply [16, Corollary 4.2.6] and get that the law of $((K^n(z))_{z \in \mathcal{K}}, C^n)$ as an element of 
$C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r$ equipped with the uniform topology satisfies a LDP in the scale $n$ with 
good rate function $J$. 



5.3. Properties of the rate function. 

To finish the proof of Theorem 5.1 (1), the last thing to check is that $I(K(\cdot), C)$ is 
infinite whenever $K$ is not Lipschitz continuous. This is the object of this subsection (see 
Lemma 5.5 (6)), together with providing further information on the functions $(K, C)$ with 
finite $I$ that will be useful in the sequel. 

We will consider the operator norm, given, for $H \in \mathsf{H}_r$, by $\|H\|_\infty = \sup |\langle u, Hu \rangle|$, where 
the supremum is taken over vectors $u \in \mathbb{C}^r$ with norm one. We also use the usual order 
on Hermitian matrices, i.e. $H_1 \le H_2$ if and only if $H_2 - H_1$ is positive semi-definite 
(respectively $H_1 < H_2$ if $H_2 - H_1$ is positive definite). 
We recall that $\Lambda$ was defined in (8). 

Lemma 5.5. (1) $H \mapsto \Lambda(H)$ is increasing, $\Lambda(-H) \le 0$ if $H \ge 0$. 
(2) Denote $(C^*)_{ij} = E[\overline{g_i}\, g_j]$. Then, for any $H \in \mathsf{H}_r$, 

$\Lambda(H) \ge \mathrm{Tr}(HC^*).$ 

If we assume moreover that $G$ satisfies the first part of Assumption 2.2 (existence of 
some exponential moments), we have the following properties. 

(3) There exists $\gamma > 0$ so that 

$B := \sup_{H : \|H\|_\infty \le \gamma} \Lambda(H) < \infty.$ 
(4) If $I(K(\cdot), C)$ is finite, $C \ge 0$ and $K(z) \ge 0$, for any $z \in \mathcal{K}$. Moreover, for all $L$, 
there exists a finite constant $M_L$ so that on $\{I \le L\}$, we have 

$\sup_{z \in \mathcal{K}} \|K(z)\|_\infty \le M_L, \quad \|C\|_\infty \le M_L.$ 

(5) If $I(K(\cdot), C)$ or $J(K(\cdot), C)$ are finite, then $z \mapsto K(z)$ is non increasing. 

(6) For all $L$, there exists a finite constant $M_L$ so that on $\{I \le L\}$, we have, 
for all $z_1, z_2 \in \mathcal{K}$, $\|K(z_2) - K(z_1)\|_\infty \le M_L |z_1 - z_2|$. 
In particular, $K'$ exists almost surely and is bounded by $M_L$. 

If we assume now that $G$ satisfies both parts of Assumption 2.2 (the law of $G$ does not 
put mass on hyperplanes), we then have the following additional properties. 

(7) For all non null positive semi-definite $H \in \mathsf{H}_r$, 

$\lim_{t \to +\infty} \Lambda(-tH) = -\infty.$ (11) 

(8) If $I(K(\cdot), C)$ is finite, then $C > 0$ and $K(z) > 0$ for any $z \in \mathcal{K}$. Moreover, for 
almost any $z \in \mathcal{K}$ and for any non zero vector $e$, there is no interval with non-empty 
interior on which the function $\langle e, K'(\cdot)e \rangle$ vanishes everywhere. 

Proof. 

(1) The first point is just based on the fact that almost surely, $\mathrm{Tr}(HZ) \ge 0$ if $H \ge 0$. 

(2) The second point follows from Jensen's inequality. 

(3) The third point is due to the fact that $\mathrm{Tr}(HZ) \le \|H\|_\infty \sum_{i=1}^r |g_i|^2$, so that by 
Hölder's inequality, 

$\Lambda(H) \le \log E\big[e^{\|H\|_\infty \sum_{i=1}^r |g_i|^2}\big] \le \frac{1}{r} \sum_{i=1}^r \log E\big[e^{r \|H\|_\infty |g_i|^2}\big],$ 

which is finite by Assumption 2.2 if $\|H\|_\infty r \le \alpha$. 

(4) To prove the fourth point let $(C, K) \in \{I \le L\}$. We first show that $C \ge 0$. We 
take $P, X = 0$ to get 

$\sup_Y \{\mathrm{Tr}(CY) - \Lambda(Y)\} \le I(K, C) \le L.$ 

Suppose now that there exists some vector $u \in \mathbb{C}^r$ such that $\langle u, Cu \rangle = a < 0$, 
and define, for any $t > 0$, $Y_t = -t uu^*$. Then $\Lambda(Y_t) \le 0$ by the first point and 
$\mathrm{Tr}(CY_t) = -at$ so that for all $t > 0$, 

$-at \le \mathrm{Tr}(CY_t) - \Lambda(Y_t) \le L.$ 

Letting $t$ go to infinity gives a contradiction. The same proof holds for $K(z)$ by 
taking $P(z) = -1_{z \ge z_0} X$ and $X = -t uu^*$ if $\langle u, K(z_0)u \rangle = a < 0$. We finally bound 
$K$ and $C$. With $\gamma$ and $B$ introduced in the third point, we define $Y = \pm \gamma uu^*$ and 
take $P, X = 0$. We get 

$\gamma |\langle u, Cu \rangle| \le B + L$ 

for all vectors $u$ with norm one, that is $\|C\|_\infty \le \gamma^{-1}(L + B)$. Similar considerations 
hold for the bound over $\|K(z)\|_\infty$. 
(5) We next prove that $z \mapsto K(z)$ is non increasing when the entropy is finite. Let us 
prove that for any $z_1, z_2 \in \mathcal{K}$ such that $z_1 < z_2$, $K(z_2) \le K(z_1)$ (dividing by $z_2 - z_1$ 
will then give the fact that $K'$ is negative semi-definite where it is defined). So let 
us fix $z_1, z_2 \in \mathcal{K}$ such that $z_1 < z_2$. Let us fix $u \in \mathbb{C}^r \setminus \{0\}$. For all real numbers 
$t > 0$, we have, for $P_t(z) := t 1_{[z_1, z_2]}(z) uu^*$, $X = Y = 0$, 

$I(K(\cdot), C) \ge t u^*(K(z_2) - K(z_1))u - \Gamma(P_t, 0, 0).$ 

Note that 

$\Gamma(P_t, 0, 0) = \int \Lambda\left( -t \int_{z_1}^{z_2} \frac{dz}{(z-x)^2}\, uu^* \right) d\mu(x) \le 0$ 

by (1) of this lemma. Thus for all $t > 0$, 

$I(K(\cdot), C) \ge t u^*(K(z_2) - K(z_1))u.$ 

It follows that $u^*(K(z_2) - K(z_1))u$ is non positive by letting $t$ go to infinity, 
which completes the proof of this point. 

(6) Take $P = -(z_2 - z_1)^{-1} 1_{[z_1, z_2]} uu^*$, $Y = -uu^* \max_{x \in \mathrm{supp}(\mu)} \int (z-x)^{-2} (z_2 - z_1)^{-1} 1_{[z_1, z_2]}(z)\,dz$ 
and $X = 0$ to get, since then $\Gamma(P, Y, X) \le 0$ by the first point, 

$-\langle u, (K(z_2) - K(z_1))(z_2 - z_1)^{-1} u \rangle \le L + \varepsilon^{-2} \|C\|_\infty,$ 

where we used that $Y$ is bounded by $\varepsilon^{-2}$. This provides the expected bound by 
the fourth point. 

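(7) Consider $\eta > 0$ and a non vanishing orthogonal projector $p \in \mathsf{H}_r$ such that $H \ge \eta p$. 
For all $t > 0$, we have, by the first point, 

$\Lambda(-tH) \le \Lambda(-t\eta p) = \log E\big[e^{-t\eta\, G^* p G}\big].$ 

Since, by dominated convergence, 

$\lim_{t \to +\infty} E\big[e^{-t\eta\, G^* p G}\big] = P\{G^* p G = 0\} = P\{G \in \ker p\} = 0$ 

(where we used Assumption 2.2 in the last equality), we have 
$\lim_{t \to +\infty} \Lambda(-tH) = \lim_{t \to +\infty} \log E\big[e^{-t\,\mathrm{Tr}(HZ)}\big] = -\infty.$ 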

(8) We already proved that $K$ is non increasing and almost surely differentiable, so 
that $K' \le 0$ almost surely. Moreover, if $u$ is a fixed vector and $\langle u, K'(\cdot)u \rangle$ vanishes 
on an interval $[z_1, z_2]$ with $z_1 < z_2$, taking $P_t = t 1_{[z_1, z_2]}(z) uu^*$ and $X = Y = 0$ 
yields 

$I(K(\cdot), C) \ge -\int \Lambda\left( -t \int_{z_1}^{z_2} \frac{dz}{(z-x)^2}\, uu^* \right) d\mu(x),$ 

which goes to infinity as $t$ goes to infinity by the previous consideration. Thus, this 
is not possible. As we have already seen that $K(a) \ge 0$ for all $a \in \mathcal{K}$, we see that 
$K(a') > 0$ for $a' < a$ unless there exists $e$ so that $\langle e, (a - a')^{-1}(K(a) - K(a'))e \rangle$ 
vanishes, which is impossible by the above. 



□ 
5.4. Study of the minimisers of I. 

We characterise the minima of $I$ as follows: 

Lemma 5.6. For any compact set $\mathcal{K}$ of $(b, \infty)$, the unique minimizer of $I$ on $C(\mathcal{K}, \mathsf{H}_r) \times \mathsf{H}_r$ 
is the pair $(K^*, C^*)$ given, for $1 \le i, j \le r$, by 

$(K^*(z))_{ij} = \int \frac{E[\overline{g_i}\, g_j]}{z - \lambda}\, d\mu(\lambda)$, for $z \in \mathcal{K}$, and $(C^*)_{ij} = E[\overline{g_i}\, g_j]$. 



Proof. $I$ vanishes at its minimisers (as a good rate function) and therefore a minimizer 
$(K, C)$ satisfies, for all $P, X, Y$, 

$\mathrm{Tr}\left( \int K'(z)P(z)\,dz + K(z^*)X + CY \right) \le \Gamma(P, X, Y).$ (12) 

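Now, for any fixed $(P, X, Y)$, there exists $\varepsilon_0 > 0$ such that for any $0 < \varepsilon < \varepsilon_0$, for any $x$ 
in the support of $\mu$ we have 

$\varepsilon \left\| -\int \frac{1}{(z-x)^2} P(z)\,dz + \frac{1}{z^* - x} X + Y \right\|_\infty < \alpha,$ 

with $\alpha$ given by Assumption 2.2. Therefore, writing $A_x := -\int \frac{1}{(z-x)^2} P(z)\,dz + \frac{1}{z^* - x} X + Y$, there exists a constant $L$ such that for any 
$x$ in the support of $\mu$, 

$\left| E\big[e^{\varepsilon\, \mathrm{Tr}(A_x Z)}\big] - E\big[1 + \varepsilon\, \mathrm{Tr}(A_x Z)\big] \right| \le \varepsilon^2 L,$ 

so that 

$\Gamma(\varepsilon P, \varepsilon X, \varepsilon Y) = \varepsilon\, \mathrm{Tr}\left( \int (K^*)'(z)P(z)\,dz + K^*(z^*)X + C^*Y \right) + O(\varepsilon^2).$ 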

As a consequence, for any minimizer $(K, C)$, we find after replacing $(P, X, Y)$ by $\varepsilon(P, X, Y)$, 
using (12) and letting $\varepsilon$ go to zero, that 

$\mathrm{Tr}\left( \int K'(z)P(z)\,dz + K(z^*)X + CY \right) \le \mathrm{Tr}\left( \int (K^*)'(z)P(z)\,dz + K^*(z^*)X + C^*Y \right).$ 

Changing $(P, X, Y)$ into $-(P, X, Y)$ gives the equality. This implies that 
$C = C^*$, $K' = (K^*)'$ a.s. and $K(z^*) = K^*(z^*)$, 
and therefore $(K, C) = (K^*, C^*)$. □ 



6. Large deviations for the largest eigenvalues in the case without outliers 

We again assume throughout this section that Assumptions 2.1, 2.2 and 2.3 hold. 

6.1. Statement of the main result. 



For any e > 0, we define the compact set /C^ := [b + e,e Let s := sign (111=1 — 
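For $x \in \mathbb{R}$, we set $\mathbb{R}^p_\downarrow(x) = \{(\alpha_1, \ldots, \alpha_p) \in \mathbb{R}^p : \alpha_1 \ge \cdots \ge \alpha_p \ge x\}$. 

We also denote by $\omega(g) := \sup_{x \ne y} \frac{|g(x) - g(y)|}{|x - y|} \in [0, \infty]$ the Lipschitz constant of a function 
$g$. For any $\varepsilon, \gamma > 0$, and $\alpha \in \mathbb{R}^p_\downarrow(b + \varepsilon)$, we put 

$S^\varepsilon_{\alpha,\gamma} := \left\{ f \in C(\mathcal{K}_\varepsilon, \mathbb{R}) : \exists g \in C(\mathcal{K}_\varepsilon, \mathbb{R}) \text{ with } \gamma \le g \le \frac{1}{\gamma},\ \omega(g) \le \frac{1}{\gamma} \text{ and } f(z) = s \cdot g(z) \prod_{i=1}^p (z - \alpha_i) \right\}.$ 

Note that in the latter product, the $\alpha_i$'s appear with multiplicity. $S^\varepsilon_{\emptyset,\gamma}$ will denote the set 
of functions as above but with no zeroes on $\mathcal{K}_\varepsilon$. We have the following theorem. 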

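Theorem 6.1. Under Assumptions 2.1, 2.2 and 2.3, the law of the $m$ largest eigenvalues 
$(\widetilde{\lambda}_1^n, \ldots, \widetilde{\lambda}_m^n)$ of $\widetilde{X}_n$ satisfies a large deviation principle in $\mathbb{R}^m$ in the scale $n$ and with good 
rate function $L$, defined as follows. For $\alpha = (\alpha_1, \ldots, \alpha_m) \in \mathbb{R}^m$, we take $\alpha_{m+1} = b$ and 

$L(\alpha) := \lim_{\varepsilon \downarrow 0}\ \inf_{\cup_{\gamma > 0} S^\varepsilon_{(\alpha_1, \ldots, \alpha_{m-k}),\gamma}} J_{\mathcal{K}_\varepsilon}$, if $\alpha \in \mathbb{R}^m_\downarrow(b)$, $\alpha_{m-k+1} = b$ and $\alpha_{m-k} > b$ for some $k \in \{0, \ldots, m-1\}$, 

$L(\alpha) := \lim_{\varepsilon \downarrow 0}\ \inf_{\cup_{\gamma > 0} S^\varepsilon_{\emptyset,\gamma}} J_{\mathcal{K}_\varepsilon}$, if $\alpha_1 = \alpha_2 = \cdots = \alpha_m = b$, 

$L(\alpha) := +\infty$ otherwise. 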

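Remark 6.2. The function $L$ is well defined. Indeed, one can easily notice that for all 
$\alpha \in \mathbb{R}^m_\downarrow(b)$ such that for some $k \in \{0, \ldots, m\}$, $\alpha_{m-k+1} = b$ and $\alpha_{m-k} > b$, the map 

$\varepsilon \mapsto \inf\{ J_{\mathcal{K}_\varepsilon}(f) \,;\, f \in \cup_{\gamma > 0} S^\varepsilon_{(\alpha_1, \ldots, \alpha_{m-k}),\gamma} \}$ 
is increasing, so that its limit as $\varepsilon$ decreases to zero exists. 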

Remark 6.3. Note that $J_{\mathcal{K}_\varepsilon}(f)$ is infinite if $f$ has more than $r$ zeroes greater than $b$. 
Indeed, by definition, if $J_{\mathcal{K}_\varepsilon}(f)$ is finite, 

$f(z) = P_{\Theta,r}(K(z), C) = c \det(A - K(z))$ 

with a non-vanishing constant $c$, a self-adjoint matrix $A$ with eigenvalues $(\theta_1^{-1}, \ldots, \theta_r^{-1})$ 
and a function $K$ with values in the set of $r \times r$ positive self-adjoint matrices so that 
$K' \le 0$ by Lemma 5.5. We may assume without loss of generality that $f$ vanishes at a 
point $x > b$, since otherwise we are done, so that there exists a non zero $e \in \mathbb{C}^r$ so that 
$K(x)e = Ae$. There is at most one $x$ at which $K(x)e = Ae$; otherwise, $\langle e, K'(\cdot)e \rangle$ would 
vanish on a non trivial interval, which is impossible by (8) of Lemma 5.5. Moreover, 
if we let $P$ be the orthogonal projection onto the orthocomplement of $e$, the function 
$H(z) = \det((1 - P)(A - K(z))(1 - P)) \det(PAP - PK(z)P)$ vanishes at $x$ and at the 
zeroes of $\det(PAP - PK(z)P)$. But $PAP$ and $PK(z)P$ have the same properties as $A$ 
and $K(z)$ except they have one dimension less. Thus, we can proceed by induction and 
see that $f$ can vanish at at most $r$ points. 

The minimisers are described by the following result. 
Theorem 6.4. If we define on $(b, \infty)$ 

$H(z) = P_{\Theta,r}(K^*(z), C^*),$ 

where $(K^*, C^*)$ are given in Lemma 5.6 and $P_{\Theta,r}$ is defined in Proposition 3.1, there exists 
$k \in \{0, \ldots, m\}$ such that $H$ has $m - k$ zeroes $(\lambda_1^*, \ldots, \lambda_{m-k}^*)$ (counted with multiplicity). 
The unique point of $\mathbb{R}^m$ on which $L$ vanishes is $(\lambda_1^*, \ldots, \lambda_{m-k}^*, b, \ldots, b)$ and consequently 
$(\widetilde{\lambda}_1^n, \ldots, \widetilde{\lambda}_m^n)$ converges almost surely to this point as $n$ grows to infinity. 

Remark 6.5. In the case when $(g_1, \ldots, g_r)$ are independent centered variables with 
variance one, one can check that $C^* = I_r$, $K^*(z) = \int \frac{1}{z - x}\, d\mu(x) \cdot I_r$ and 

so that we recover [HI Theorem 2.1] or (TUl Theorem 1.3]. 
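
To make Remark 6.5 concrete, here is a small numerical sketch. By Remark 6.3, $H$ is proportional to $\det(A - K^*(z))$ with $A$ having eigenvalues $\theta_i^{-1}$, so when $K^*(z) = \int \frac{d\mu(x)}{z-x} \cdot I_r$ the zeroes of $H$ on $(b, \infty)$ solve $\int \frac{d\mu(x)}{z - x} = \theta_i^{-1}$. The choice of $\mu$ as the semicircle law on $[-2, 2]$ (so $b = 2$) is an assumption made only for this illustration, and the function names are hypothetical; in that case the solution is known to be $\theta + 1/\theta$ when $\theta > 1$, and no solution exists when $0 < \theta \le 1$.

```python
import math

def stieltjes_semicircle(z: float) -> float:
    """G(z) = int dmu(x)/(z - x) for the semicircle law on [-2, 2], for z >= 2.

    Written as 2 / (z + sqrt(z^2 - 4)), which equals (z - sqrt(z^2 - 4)) / 2
    but avoids cancellation for large z.
    """
    return 2.0 / (z + math.sqrt(z * z - 4.0))

def outlier_location(theta: float, b: float = 2.0, z_max: float = 1e6, tol: float = 1e-12):
    """Solve G(z) = 1/theta on (b, z_max) by bisection; return None if there is no solution."""
    if theta <= 0.0:
        raise ValueError("this sketch only treats positive theta")
    target = 1.0 / theta
    if target >= stieltjes_semicircle(b):
        return None  # G decreases from G(b) = 1 towards 0, so 1/theta is never reached
    lo, hi = b, z_max
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if stieltjes_semicircle(mid) > target:
            lo = mid  # still to the left of the zero
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for theta in (0.5, 1.0, 2.0, 3.0):
        print(theta, outlier_location(theta))
```

In the notation of Theorem 6.4, each $\theta_i > 1$ contributes one zero $\lambda_i^* = \theta_i + 1/\theta_i$ in this example, while the remaining coordinates of the limit point equal $b$.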
6.2. Preliminary remarks and strategy of the proof. 

Let us first notice that at most $m$ eigenvalues of $\widetilde{X}_n$ can deviate from the bulk since by 
Weyl's interlacing inequalities (see e.g. [27, Section 4.3]) 

$\widetilde{\lambda}_{m+1}^n \le \lambda_1^n,$ 

which converges to $b$ as $n$ goes to infinity. 
Secondly, let us state the following lemma. 

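Lemma 6.6. The law of the sequence $(\widetilde{\lambda}_1^n, \ldots, \widetilde{\lambda}_m^n)$ of the $m$ largest eigenvalues of $\widetilde{X}_n$ is 
exponentially tight in the scale $n$. 

Proof. Let us define $R_n := \widetilde{X}_n - X_n$ and denote by $\|R_n\|_\infty$ the operator norm of the 
perturbation matrix $R_n$. Note that for all $k$, 

$\lambda_k^n - \|R_n\|_\infty \le \widetilde{\lambda}_k^n \le \lambda_k^n + \|R_n\|_\infty.$ 

Since for any fixed $k$, the non random sequence $\lambda_k^n$ converges to $b$ as $n$ tends to infinity, 
it suffices to prove that 

$\limsup_{L \to \infty} \limsup_{n \to \infty} \frac{1}{n} \log P(\|R_n\|_\infty \ge L) = -\infty.$ (13) 

For the orthonormalized perturbation model, since $\|R_n\|_\infty = \max\{\theta_1, -\theta_r\}$, (13) is clear. 
In the i.i.d. perturbation model, we have, for $\theta := \max_{1 \le i \le r} |\theta_i|$, 

$\|R_n\|_\infty \le \frac{\theta}{n} \sum_{k=1}^n \sum_{i=1}^r |g_i(k)|^2.$ 

It implies, by Tchebychev's inequality, that 

$P(\|R_n\|_\infty \ge L) \le e^{-\frac{n\alpha L}{\theta}} E\left[ \exp\left( \alpha \sum_{k=1}^n \sum_{i=1}^r |g_i(k)|^2 \right) \right] = e^{-\frac{n\alpha L}{\theta}} \left( E\left[ \exp\left( \alpha \sum_{i=1}^r |g_i|^2 \right) \right] \right)^n,$ 

which allows to conclude by Assumption 2.2. □ 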
As the law of $(\widetilde{\lambda}_1^n, \ldots, \widetilde{\lambda}_m^n)$ is exponentially tight, the proof of Theorem 6.1 reduces to 
establishing a weak LDP. In virtue of [16, Theorem 4.1.11] (see also [1, Corollary D.6]), 
this weak LDP (and the fact that $L$ is a rate function) will be a direct consequence of 
Equation (18) and Lemma 6.9 below. The fact that $L$ is a good rate function is then 
implied by exponential tightness [16, Lemma 1.2.18]. 



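6.3. The structure of $H^n$. 

From Proposition 3.1, we know that the $\widetilde{\lambda}_i^n$'s are essentially the zeroes of $H^n$. However, 
$H^n$ could a priori have other zeroes than these eigenvalues or take arbitrarily small values. 
To control this point, we need to understand better the structure of $H^n$. Let 

$C^\varepsilon_{k,\gamma} := \left\{ f \in C(\mathcal{K}_\varepsilon, \mathbb{R}) : \exists p \text{ polynomial of degree } k \text{ with } k \text{ roots in } \mathcal{K}_\varepsilon \text{ and dominant coefficient } 1,\ g \in C(\mathcal{K}_\varepsilon, \mathbb{R}) \text{ with } \gamma \le g \le \frac{1}{\gamma},\ \omega(g) \le \frac{1}{\gamma} \text{ and } f(z) = s \cdot g(z)p(z) \right\}$ 

and 

$C^\varepsilon_\gamma := \bigcup_{0 \le k \le m} C^\varepsilon_{k,\gamma}.$ 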

We intend to show the following fact 

Lemma 6.7. For any e > small enough, there exists a positive integer no{e) , L{e) > 
and a sequence of random functions {gn) such that for any z E JC^ and n > nQ{e), 



H'^iz) 
with 



■^111=1 Iki'^^rll2 9n{z) YliLii^ ~ K) orthonormalized perturbation model, 

s gn{z) YYiLii^ ~ -^r) i.i.d. perturbation model, 

s = {-\Y-^, L{e)<g^<^ and a;(^?„) < -1-. (14) 



In particular, for any e > 0, 

" ~ ' e [Cl^r 1 = -00. (15) 



limsuplimsup-logP ( ( f\{z - XI)-^H''{z)\ 



and 



limsuplimsup -logP {H'' e (C'^Y) = -00. (16) 

74,0 n— ^00 IT- 

Proof. Let us define the random sequence 

111=1 Ikr^nii i'^ the orthonormalized perturbation model, 

in the i.i.d. perturbation model. 
Going back to the proof of Proposition 3.1, one can easily see that, for any $z \notin \{\lambda_1^n, \ldots, \lambda_n^n\}$, 

r ( ^ 

H'^iz) =Cnl[ det(^J„ - det Uj„ - X„ - J]] ^.t/f (f/f ) 

i=l \ 1=1 

We can rewrite the above as H"-{z) = Cngn{z) YYlLii^ ~ K) with 



z-X? 



Now, for e > fixed, we shall bound Qn and its Lipschitz constant on /C^. 

As /Cg is compact and the A" belong to a fixed compact, for e > small enough, for 

\- < 2 

« — e 



any i and 2; e /Cg we have 2; — A^ < - and |A"| < - so that 



n rn n ^ 

< (^"- - = E - E ^" ^ 2m-. (17) 

i=m+l i=l i=n—m 

We choose no(£:) such that for n > no{e) and any i and 2 G /C^ we have | < 2 — A" so 
that as 2 - A" < -, 

\n \n ^ \n 2 

-2-A^~ 2-A^- 4' 

Now, using Weyl's interlacing properties, we have for any i > m + 1, 
Ar < Ar_™, so that X- - X- > -{Xtm. - AD- 

For < X < 1 — ^, log(l — x) > —-^x, so that we finally get by ffT7|) . 

i=l i=l 

By very similar arguments (using log(l + x) < x for x > 0), one can also check that for 
any n > no{e), 



^?n(^)<ni^^r' [-) 



4- 2m 



The proof of the uniform equicontinuity of $g_n$ on $\mathcal{K}_\varepsilon$ is left to the reader as the arguments 
are very similar since $z \mapsto (z - \lambda_i^n)^{-1}$ is uniformly continuous on $\mathcal{K}_\varepsilon$ for $n$ large enough. 

To prove f|T6|) and f|T5|) . it is therefore sufficient to prove that, with probability greater 
than 1 — e~'^"' for somme c > 0, we have that c„ and are bounded which is a direct 
consequence of Lemma [11.1^ and that, A^ < e^^ for small e, which is proved in Lemma 
16:61 □ 

The main application of the previous lemma will be the following continuity properties 
of the zeroes of functions in $C^\varepsilon_\gamma$. 

Lemma 6.8. Let $\varepsilon > 0$ be fixed, $\gamma > 0$ small enough, and $k \in \mathbb{N}$ be fixed. Let $\alpha_1^0 \ge \cdots \ge 
\alpha_k^0 \in \mathcal{K}_\varepsilon$ and $f_0(z) = h_0(z) \prod_{i=1}^k (z - \alpha_i^0) \in C^\varepsilon_{k,\gamma}$ be given. Then, for all $\delta > 0$ there exists 
$\delta' > 0$ so that 



7 



m 

C < z H- h{z) \ \{z — ai) : h G C'g , max \ai — a°| < 5, maxcti < b + 2e 

i=l 

Proof. This amounts to show that if G C,^ is a sequence converging (for the uniform 
topology on /Ce) to / G Cl^^, m — k zeroes of the functions will be below b + 2e and 
the others will converge to the zeroes of /. Indeed, if we take a sequence /" G C^, we 
can always denote it /"(-z) = h"-{z) YYiLii^ ~ '^F) (with possibly some a" G {b + e/4:,b + 
Se/A) if & Cl^ with < m), as this amounts at the worst to change h and take 7 
smaller. Then, the crucial point is that /i" is tight by Arzela-Ascoli theorem so that we 
can consider a converging subsequence. As the a" belong to [b, l/e], we can also consider 
converging subsequences. Thus, f"' converges along subsequences to a function / with 
f{z) = h{z) YYiLii^ ~ ^i) with ftj G [6, l/e]. But then we must have f = f which 

allows in particular to identify k limit points with the zeroes of /, the others being below 
b + 2e. □ 

6.4. Core of the proof. 

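First, from what we said in the preliminary remarks and the fact that the $\widetilde{\lambda}_i^n$ are 
decreasing, we obviously have that if $\alpha \notin \mathbb{R}^m_\downarrow(b)$, one has 

$\limsup_{\delta \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} \log P\left( \bigcap_{1 \le i \le m} \{ |\widetilde{\lambda}_i^n - \alpha_i| \le \delta \} \right) = \liminf_{\delta \downarrow 0} \liminf_{n \to \infty} \frac{1}{n} \log P\left( \bigcap_{1 \le i \le m} \{ |\widetilde{\lambda}_i^n - \alpha_i| \le \delta \} \right) = -\infty.$ (18) 

The weak LDP will then be a direct consequence of the following lemma, with $k$ the 
number of eigenvalues going to $b$. 

Lemma 6.9. Let $\alpha \in \mathbb{R}^m_\downarrow(b)$ and $k$ between $0$ and $m$ such that $\alpha_{m-k+1} = \cdots = \alpha_m = b$ and 
$\alpha_{m-k} > b$ if $k < m$. We have 

$\lim_{\varepsilon \downarrow 0} \limsup_{\delta \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} \log P\left( \bigcap_{1 \le i \le m-k} \{ |\widetilde{\lambda}_i^n - \alpha_i| \le \delta \} \cap \bigcap_{m-k+1 \le i \le m} \{ \widetilde{\lambda}_i^n \le b + \varepsilon \} \right)$ 

$= \lim_{\varepsilon \downarrow 0} \liminf_{\delta \downarrow 0} \liminf_{n \to \infty} \frac{1}{n} \log P\left( \bigcap_{1 \le i \le m-k} \{ |\widetilde{\lambda}_i^n - \alpha_i| \le \delta \} \cap \bigcap_{m-k+1 \le i \le m} \{ \widetilde{\lambda}_i^n \le b + \varepsilon \} \right) = -L(\alpha),$ 

with the obvious convention that $\bigcap_{m-k+1 \le i \le m} \{ \widetilde{\lambda}_i^n \le b + \varepsilon \} = \Omega$ if $k = 0$. 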

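Proof. Let $\delta$ and $\varepsilon$ be positive small enough constants so that $\alpha_{m-k} - \delta > b + 2\varepsilon$. In particular, 
$\bigcup_{i=1}^{m-k} [\alpha_i - \delta, \alpha_i + \delta] \subset \mathcal{K}_\varepsilon$. On the set $\bigcap_{1 \le i \le m-k} \{ |\widetilde{\lambda}_i^n - \alpha_i| \le \delta \} \cap \bigcap_{m-k+1 \le i \le m} \{ \widetilde{\lambda}_i^n \le b + \varepsilon \}$, 
for all $i \le m - k$, $\widetilde{\lambda}_i^n$ is in $\mathcal{K}_\varepsilon$. On the other hand, for $n$ large enough, $\{\lambda_1^n, \ldots, \lambda_n^n\} \cap \mathcal{K}_\varepsilon = \emptyset$. 
Therefore, $\widetilde{\lambda}_i^n \notin \{\lambda_1^n, \ldots, \lambda_n^n\}$ for $i \in \{1, \ldots, m - k\}$ and, by Proposition 3.1, is a zero of 
$H^n$. 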
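Let us next prove the large deviation upper bound and fix $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_{m-k} > b$. 
A function $f \in C^\varepsilon_\gamma$ which vanishes within a distance $\delta$ of $(\alpha_i)_{1 \le i \le m-k}$ with $\delta < \alpha_{m-k} - b$ 
belongs to the set 

$B^\varepsilon_{\alpha,\gamma,\delta} := \left\{ f \in C(\mathcal{K}_\varepsilon, \mathbb{R}) : \exists g \in C(\mathcal{K}_\varepsilon, \mathbb{R}) \text{ with } \gamma \le g \le \frac{1}{\gamma},\ \omega(g) \le \frac{1}{\gamma} \text{ and } f(z) = s \cdot g(z) \prod_{i=1}^{m-k} (z - \beta_i) \text{ with } \forall i \le m - k,\ \beta_i \in [\alpha_i - \delta, \alpha_i + \delta] \right\}.$ (19) 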

Moreover, writing H^{z) = h^{z)YYiLi{z ~ ^7) by Lemma 16.7^ clearly iJ" belongs to 
-^a,7,<5 soon as for some e' < e and 7' ■ (e')™ > 7, /i" G Cg y and ni<i<m-fc{l-^? - "il < 
^} nm-A:+i<j<m{'^r < ^ + £^ — holds. As a cousequcuce, we can write 

P( fl {\\--a,\<5} fl {\l<b + e-e'}\ 

\l<i<m—k m—k+l<i<m / 

<P(i/"G5;,,,)+p(/l"G(Q,V)j. 

Then, by [161 Lemma 1.2.15], 

limsupilogPj fl {\K-c^i\<S} fl {Ar<6 + £-£'}j 



< 



max jhmsup - logP (if" G 5^,^,5) ; limsup - logP (/i" G (Q'y)') | ' (^0) 



Moreover, -8^,7,5 is a closed subset of C(/Ce,M). Indeed, if we take a converging sequence 

= sgn{z) W^S^{z — /Sf), since the /9",n > belongs to compacts and the gn.n > 
are tight by Ascoli-Arzela's theorem, we can always assume up to extraction that gn and 
< i < m — k converge so that the limit of belongs to i?^ ^ ^. 

Since Jk:^ is a good rate function, (-Ba_^_5)<5>o is a nested family and ^&>oBl^,^^^ = 
Sf N , Theorem 15. II gives with [161 Lemma 4.1.61 that 

lim sup limsup- log P (i/" G 5^,^,5) <- ^ inf J/c,. (21) 

(54-0 n->-oo ''T' ' ' •^fai,...,Q„_j.),7 

Taking 7' = 7q small enough, f|T5|) and f l20l) give for 7/(5')™ < 7q, 

lim sup limsup- log P ( f {|Ar-ttil<^} f < ^ + ^ " 

< - inf <- „ inf ^/c^- 

l.-"."m-fc) 



5"? X U^>o5'f ^ 
We can finally take $\gamma = 0$ (nothing depends on it anymore), $\varepsilon - \varepsilon'$ going to zero, as the 
left hand side obviously decreases as $\varepsilon - \varepsilon'$ decreases to $0$ and, as we already mentioned 
in Remark 6.2, the right hand side increases as $\varepsilon$ decreases to $0$. 

We turn to the lower bound, which is a bit more delicate. Let us again consider 5 and e 
small enough so that r^~^\ai — 5, ctj + 5] C /Ce. As Jk.^ is a good rate function and S%^^ is 
closed, for all 7 > 0, the infimum inf^e J^^ is achieved, say at f^^- To complete 

the proof, we need the following lemma, based on the structure of B.^ and whose proof is 
a direct application of Lemma 16.81 

Lemma 6.10. Let e, 7 he fixed and small enough. There exists 60 such that for any S < 6q, 

there exists 6' > such that for any n, 

{i/"Gq}n|sup|//"(x)-4''^(x)|<5'|c fl {\K-c^^\<S} n W<H2£} 

^ ^ l<i<m—k m—k+l<i<m 

To prove the lower bound in Theorem 15. we may assume without loss of generality 
that 

J := lim inf J/c^ < 00. 



£4,0 u-,>oSf , 



Let ?7 > be fixed. As 



inf J^^ = inf inf J^^ = inf J^^(/^'^), 

U-Y>0'!'f X 7>0 Sf , 7>0 

we can choose £,7 small enough so that Jjc^ifj'^) ^ J + V- By (ITB]) . there exists L{j,e) 
going to infinity as 7, e go to zero so that for n large enough. 

We choose 7, e small enough so that L{e, 7) > J + 2//. 

Lemma [6. lUI implies, that for 6 < 60, for 6' small enough, 77 > 0, for n large enough. 



P 



( n {l^r-«d<5} n {K<b + 2e}\ 

\l<i<m—k m—k+l<i<m / 

> p f sup \H-{z) - f!;'%z)\ <6')-F (i/" G (q)^) > i 



the last inequality following from Theorem 15.11 (|2]) . As 77 can be chosen as small as we 
want, we conclude by taking first n going to infinity, and then 6, e, 77 to zero. 

□ 

6.5. Identification of the minimizers. 

We prove Theorem 6.4, which is straightforward. Since $L$ is a good rate function, it 
vanishes at its minimizers $(\lambda_1^*, \ldots, \lambda_m^*) \in \mathbb{R}^m_\downarrow(b)$. Putting $\lambda_0^* = b + 1$, we know that there 
exists $0 \le k \le m$ such that $\lambda_{m-k}^* > b$ and $\lambda_{m-k+1}^* = b$. From the definition of $L$, for any $n$ 
large enough such that $b + \frac{1}{n} < \lambda_{m-k}^*$, we can find a function $f_n$ defined on $\mathcal{K}_{1/n}$ vanishing 
at $(\lambda_1^*, \ldots, \lambda_{m-k}^*)$ such that $J_{\mathcal{K}_{1/n}}(f_n) \le \frac{1}{n}$. From the definition of $J_{\mathcal{K}}$ and the fourth and 
sixth points of Lemma 5.5, all the functions $f_n$ are in a compact set of $C((b, \infty), \mathbb{R})$ so that 
we can find a function $f$ vanishing at $(\lambda_1^*, \ldots, \lambda_{m-k}^*)$ such that $J_{\mathcal{K}_\varepsilon}(f) = 0$ for all $\varepsilon > 0$. 

But the latter also implies that $f(z) = P_{\Theta,r}(K(z), C)$ with $(K, C)$ minimising $I$, that is 
$(K, C) = (K^*, C^*)$ by Lemma 5.6. 

7. Large deviations for the eigenvalues of Wishart matrices 

In this section, we study the i.i.d. perturbation model when $X_n = 0$. More precisely, 
we consider $G = (g_1, \ldots, g_r)$ satisfying Assumption 2.2, $n \times r$ matrices $G_n$ whose rows 
are i.i.d. copies of $G$, a diagonal matrix $\Theta = \mathrm{diag}(\theta_1, \ldots, \theta_r)$ and we study the large 
deviations of Wishart matrices $W_n = \frac{1}{n} G_n \Theta G_n^*$. This matrix has zero as an eigenvalue 
with multiplicity at least $n - r$ and we refer in the whole section to the $r$ eigenvalues of 
$W_n$ that can be non-zero as "the eigenvalues of $W_n$". The large deviations for the largest 
and smallest such eigenvalues were already studied in [23] in the case when $\Theta = (1, \ldots, 1)$ 
and the $g_i$'s are i.i.d. 

Proposition 7.1. Assume that $G$ satisfies Assumption 2.2. Let $\Theta = \mathrm{diag}(\theta_1, \theta_2, \ldots, \theta_r)$ 
be a diagonal matrix with positive entries. Then, the law of the eigenvalues of $W_n$ satisfies 
a large deviation principle in the scale $n$ with rate function which is infinite unless $\alpha_1 \ge 
\cdots \ge \alpha_r > 0$ and in this case given by 

$L(\alpha_1, \ldots, \alpha_r) = \inf\{ J(C) : (\alpha_1, \ldots, \alpha_r) \text{ are the eigenvalues of } \Theta^{1/2} C \Theta^{1/2} \}.$ 

Note that the previous proposition could also have been deduced directly from Cramér's 
theorem and the contraction principle. 

The Gaussian case allows an exact computation, given by the following 

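Corollary 7.2. Assume that $G = (g_1, \ldots, g_r)$ is a Gaussian vector with positive definite 
covariance matrix $R$. Let $\Theta = \mathrm{diag}(\theta_1, \ldots, \theta_r)$ be a diagonal matrix with positive 
entries. We denote by $0 < r_1(\Theta) \le r_2(\Theta) \le \ldots \le r_r(\Theta)$ the eigenvalues of the matrix 
$\Theta^{-1/2} R^{-1} \Theta^{-1/2}$ in increasing order. 

Then, the law of the eigenvalues of $W_n$ satisfies a large deviation principle in the scale $n$ 
with rate function which is infinite unless $\alpha_1 \ge \alpha_2 \ge \ldots \ge \alpha_r > 0$ and otherwise given by 

$L(\alpha_1, \ldots, \alpha_r) = \frac{1}{2} \sum_{i=1}^r \big( \alpha_i r_i(\Theta) - 1 - \log(\alpha_i r_i(\Theta)) \big).$ 

In the particular case when the entries are i.i.d. standard normal, $R = I_r$ and the $r_i(\Theta)$ 
are the inverses of the entries of $\Theta$ ranked in increasing order, so that, for $\theta_1 \ge \cdots \ge \theta_r$, the above rate function 
can be rewritten 

$L(\alpha_1, \ldots, \alpha_r) = \frac{1}{2} \sum_{i=1}^r \left( \frac{\alpha_i}{\theta_i} - 1 - \log \frac{\alpha_i}{\theta_i} \right).$ 

Here $J$ denotes the function appearing in Proposition 7.1, namely 

$J(C) = \sup_{Y \in \mathsf{H}_r} \left\{ \mathrm{Tr}(CY) - \log E\big[e^{\langle G, YG \rangle}\big] \right\}.$ 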





Now, by a straightforward use of the contraction principle, we can derive some results 
about the deviations of the largest eigenvalue. This problem was addressed in particular 
in [23]. The following corollary holds for the Gaussian case. 

Corollary 7.3. Under the assumptions of Corollary 7.2, the law of the largest eigenvalue 
satisfies a LDP with good rate function 

$L_{\max}(x) = \frac{1}{2} \big( x r_1(\Theta) - 1 - \log(x r_1(\Theta)) \big)$ if $x \ge \frac{1}{r_1(\Theta)}$, 

$L_{\max}(x) = \frac{1}{2} \sum_{i=1}^k \big( x r_i(\Theta) - 1 - \log(x r_i(\Theta)) \big)$ if $\frac{1}{r_{k+1}(\Theta)} \le x \le \frac{1}{r_k(\Theta)}$, 

with the convention that $r_{r+1}(\Theta) = \infty$. 

In particular, in the i.i.d. standard case when $\Theta = \mathrm{diag}(1, \ldots, 1)$, we have 

$L_{\max}(x) = \frac{1}{2} (x - 1 - \log x)$ if $x \ge 1$, 

$L_{\max}(x) = \frac{r}{2} (x - 1 - \log x)$ if $x \in (0, 1)$, 

and this allows to retrieve [23, Corollary 2.1] (note that a direct proof based on the 
formula for the joint law of the eigenvalues is then also available). This is in agreement with 
the fact that as $r$ goes to infinity, we expect the deviations below one to be impossible in 
this scale. 
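
The following sketch tabulates the rate function of the largest eigenvalue in the i.i.d. standard case. It assumes real standard Gaussian entries and the normalisation obtained from the general formula of Corollary 7.3 when all $r_i(\Theta) = 1$, that is $\frac{1}{2}(x - 1 - \log x)$ above one and $\frac{r}{2}(x - 1 - \log x)$ on $(0, 1)$; the function name is hypothetical. It illustrates the asymmetry discussed above: deviations above one cost the same for every $r$, whereas deviations below one require all $r$ eigenvalues to deviate and their cost grows linearly in $r$.

```python
import math

def rate_largest_eigenvalue(x: float, r: int) -> float:
    """Rate function of the largest eigenvalue, i.i.d. real standard Gaussian case (assumed).

    Above 1 only one eigenvalue has to deviate; below 1 all r eigenvalues must
    be pushed down to x, hence the factor r.
    """
    if x <= 0.0:
        return math.inf
    base = 0.5 * (x - 1.0 - math.log(x))
    return base if x >= 1.0 else r * base

if __name__ == "__main__":
    for r in (1, 5, 50):
        below = rate_largest_eigenvalue(0.5, r)
        above = rate_largest_eigenvalue(2.0, r)
        print(f"r={r:3d}  L_max(0.5)={below:.4f}  L_max(2.0)={above:.4f}")
```

The function vanishes only at $x = 1$, which is the almost sure limit of the largest eigenvalue in this case.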



In the general case, we have 

Corollary 7.4. Under the assumptions of Proposition 7.1, the law of the largest eigenvalue 
of $W_n$ satisfies a large deviation principle in the scale $n$ with a rate function $L_{\max}(\alpha)$ 
which satisfies, for any $\alpha \in \mathbb{R}$, 

$L_{\max}(\alpha) = \inf\{ L(\alpha_1, \ldots, \alpha_r) : \max_i \alpha_i = \alpha \}$ 

$\ge \inf_{\|v\|_2 = 1} \sup_{t \in \mathbb{R}} \left\{ t\alpha - \log E\big[e^{t |\langle v, \Theta^{1/2} G \rangle|^2}\big] \right\} =: I_{r,\Theta}(\alpha).$ 

From there, one can easily improve the upper bound on the probability of deviations 
of the largest eigenvalue of [22, Theorem 2.1]: 

Corollary 7.5. Assume that $G$ satisfies Assumption 2.2 and that the $g_i$'s are i.i.d. with 
mean $0$ and variance $1$. Let $\Theta = \mathrm{diag}(\theta_1, \theta_2, \ldots, \theta_r)$ be a diagonal matrix with positive 
entries, with $\theta_1 \ge \theta_2 \ge \cdots \ge \theta_r$. Then we have that, for $\alpha \ge \theta_1$, 

$\lim_{n \to \infty} \frac{1}{n} \log P(\lambda_{\max} \ge \alpha) = -I_{r,\Theta}(\alpha).$ 

Note that when $\alpha \ge \theta_1$, $I_{r,\Theta}(\alpha) = \inf_{[\alpha, \infty)} L_{\max}$ and in particular $I_{r,\Theta}$ is not necessarily 
lower semi continuous. We refer to [23] for more properties of $I_{r,\Theta}$, related results and 
conjectures. 

Proof of Proposition 7.1. In the case where $X_n = 0$, we can apply Theorem 6.1 with 
$P_{\Theta,r}(K(z), C) = \det(z - \Theta^{1/2} C \Theta^{1/2})$ and $I(C) = J(C)$. Hence, for $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_r > 0$, 
$L(\alpha)$ is the infimum of $J$ over the nonnegative Hermitian matrices $C$ such that $\Theta^{1/2} C \Theta^{1/2}$ 
has spectrum $(\alpha_1, \ldots, \alpha_r)$. □ 

Proof of Corollary 7.2. In this case, $\log E[e^{\langle G, YG\rangle}]$ equals $\log\det[(R^{-1} - 2Y)^{-1/2} R^{-1/2}]$
if $R^{-1} - 2Y > 0$, and is infinite otherwise. A classical saddle point analysis shows that
the supremum in J is taken at

$$Y = \tfrac{1}{2}\,(R^{-1} - C^{-1}),$$

which yields

$$J(C) = \tfrac{1}{2}\,\mathrm{Tr}(C R^{-1} - I) - \tfrac{1}{2} \log\det(C R^{-1}).$$

We finally take the infimum over C so that $\Theta^{1/2} C \Theta^{1/2} = \sum_{i=1}^r \alpha_i e_i e_i^*$ for some orthonormal
basis (ONB) $(e_i)_{1 \leq i \leq r}$. This gives

1 1 ^ r 



2 ^ ' ' 2 

1=1 i=l 



□ 
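
The saddle point analysis invoked in the proof of Corollary 7.2 can be spelled out in a few lines. The sketch below assumes that G is a real centered Gaussian vector with invertible covariance R and that C is positive definite. The names $\Lambda$, $F$ and $Y_*$ are introduced here only for the computation.

```latex
% Assumption: G real centered Gaussian with covariance R, so that
% \Lambda(Y) := \log E[e^{\langle G, YG\rangle}] = -\tfrac12 \log\det(I - 2RY) whenever R^{-1} - 2Y > 0.
\begin{align*}
F(Y) &:= \mathrm{Tr}(CY) - \Lambda(Y)
      = \mathrm{Tr}(CY) + \tfrac12 \log\det(R^{-1} - 2Y) + \tfrac12 \log\det R,\\
\nabla F(Y) &= C - (R^{-1} - 2Y)^{-1} = 0
      \iff Y_* = \tfrac12\,(R^{-1} - C^{-1}),\\
\mathrm{Tr}(C Y_*) &= \tfrac12\,\mathrm{Tr}(C R^{-1} - I),
      \qquad R^{-1} - 2Y_* = C^{-1},\\
J(C) = F(Y_*) &= \tfrac12\,\mathrm{Tr}(C R^{-1} - I) - \tfrac12 \log\det C + \tfrac12 \log\det R
      = \tfrac12\,\mathrm{Tr}(C R^{-1} - I) - \tfrac12 \log\det(C R^{-1}).
\end{align*}
```

Since F is concave in Y on the set where $R^{-1} - 2Y > 0$, the critical point $Y_*$ is indeed the maximiser. In dimension one with $R = 1$ this reduces to $J(c) = \frac12 (c - 1 - \log c)$.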



Proof of Corollary 7.4. We only need to take, in the definition of J(C), $Y = t v v^*$ if
C has eigenvector v for its largest eigenvalue to get a lower bound on J(C), and thus on
L. □

Proof of Corollary 7.5. The inequality in Corollary 7.4 gives the upper bound and the
lower bound is obtained by the same proof as in [23], that is by noticing that

$$P(\lambda_{\max} > \alpha) = P\Big( \sup_{\|x\|_2 = 1} \langle x, W_n x\rangle > \alpha \Big) \geq \sup_{\|x\|_2 = 1} P\big( \langle x, W_n x\rangle > \alpha \big)$$

and that for fixed x, $\langle x, W_n x\rangle = n^{-1} \sum_{k=1}^n |\langle x, \Theta^{1/2} G(k)\rangle|^2$ is a sum of i.i.d. random variables
so that Cramér's theorem applies. By arguments as in [23], one can also check that $I_{r,\Theta}$ is
increasing on $[\theta_1, \infty)$, which concludes the proof. □
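
As a sanity check on the Cramér step, the following minimal Python sketch compares a numerical Legendre transform with a closed form in the simplest situation. It assumes real standard Gaussian entries and $\Theta = I$, so that for a fixed unit vector x the quadratic form is an average of squares of independent standard Gaussians; the function names and the grid parameters are illustrative choices, not part of the paper.

```python
import math


def log_mgf_square(t):
    """log E[exp(t * g**2)] for a real standard Gaussian g (finite iff t < 1/2)."""
    if t >= 0.5:
        return math.inf
    return -0.5 * math.log(1.0 - 2.0 * t)


def cramer_rate_numeric(x, t_min=-5.0, steps=20000):
    """Grid approximation of sup_t { t*x - log_mgf_square(t) } over t in [t_min, 1/2)."""
    best = -math.inf
    for k in range(steps):
        t = t_min + (0.5 - t_min) * k / steps  # stays strictly below 1/2
        best = max(best, t * x - log_mgf_square(t))
    return best


def cramer_rate_closed(x):
    """Closed form (x - 1 - log x) / 2 of the same supremum, valid for x > 0."""
    return 0.5 * (x - 1.0 - math.log(x))


for x in (0.5, 1.0, 2.0, 3.0):
    print(x, cramer_rate_numeric(x), cramer_rate_closed(x))
```

The maximiser is $t = \frac12 (1 - 1/x)$, so the grid lower bound `t_min = -5.0` is adequate only for x not too close to zero; smaller x would need a wider grid.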

8. Large deviations for $H^n$ in the presence of outliers

We now turn to the proof of the LDP in the presence of outliers, which will be stated in
detail in Theorem 9.1. The proof follows the same lines as in the case without outliers
and therefore starts with the study of the deviations of $H^n$.

Let /C° := ljr=i['^«'^«] ^ compact subset of (6, oo) \ {if , . . . , We equip again 
C(/C°, Hf.) X \-\r with the uniform topology. Hereafter, we denote by li = if ioi 1 < i < 
and ii = for p+ + 1 < i < J9+ + p-. 

We recall that K'^{z) and C" were defined in (j3]) and (E]) respectively. 

Theorem 8.1. We assume that Assumptions 2.1, 2.2, 2.5 and 2.6 hold.

(1) The law of $((K^n(z))_{z \in \mathcal{K}^\circ}, C^n)$, viewed as an element of the space $C(\mathcal{K}^\circ, H_r) \times H_r$
endowed with the uniform topology, satisfies a large deviation principle in the scale
n with rate function $I^\circ$. For $K \in C(\mathcal{K}^\circ, H_r)$ and $C \in H_r$, $I^\circ(K(\cdot), C)$ is infinite if
$z \mapsto K(z)$ is not uniformly Lipschitz on $\mathcal{K}^\circ$. Otherwise, it is given by



P++P 



1=1 



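where the infimum is taken over the families $K_0(\cdot) \in C(\mathcal{K}^\circ, H_r)$, $C_0, L_1, \ldots, L_{p^+ + p^-} \in H_r$
satisfying the condition

$$K_0(\cdot) + \sum_{i=1}^{p^+ + p^-} \frac{1}{\cdot - \ell_i}\, L_i = K(\cdot), \qquad C_0 + \sum_{i=1}^{p^+ + p^-} L_i = C \qquad (22)$$

and with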

T*{K{-),C)= sup l^r^y i^'(2)P(^)d2 + ^J^(6i)Xi + CFj 

t/ie supremum being taken over piecewise constant P with values in H,., X = 
(Xi,...,XpJ G (H,)Po an(iF G K,. 
(2) r/ie law of {H"'{z))z^ico on C(/C°,M) equipped with the uniform topology, satisfies 
a large deviation principle in the scale n with rate function given, for a function 
/gC(/C°,M), by 

J°„(/) = inf {F(K(-), C) ; (K(-), C) G C(/C°, H,,) x H„ PeA^iz), C) = f{z) Vz G /C°}. 

Note that the function $\Gamma^*$ is well defined because if K is uniformly Lipschitz on $\mathcal{K}^\circ$,
then so is any $K_0$ satisfying the compatibility condition (22), so that $K_0'$ almost surely
exists.

Under the second assertion of Assumption 2.6 we have the following straightforward
application of the contraction principle.

Lemma 8.2. Let Z\ be the W^-valued random variable such that for 1 < i < j < r, 
(Zi)jj = gi{l)gj(l). Under Assumption \2.6[ ^ also satisfies a large deviation principle 
in the scale n with a good rate function I^^\M) = inf{J(f ) : vlVj = M^, 1 < i, j < t}. 

The proof of Theorem 18.11 follows the same lines as that of Theorem 15. except that 
the LDP for finite dimensional marginals for our process is described by Theorem 3.2 of 
[29] instead of Theorem 2.2 of [30]. It is based on the large deviations for and that 
can be, up to a re-indexation, shown to be exponentially equivalent to 

k=p++p-+l k=l " 

which satisfy a LDP by independence of the gi{k), and large deviations of each parts 
by Proposition 15.31 and Lemma 18.21 The corresponding rate function will be denoted 
by (/|,j''"'^*^)°. To define this new rate function, we first extend in an obvious way the 
definition of J|J^' "'^" for $z_i$'s in $\mathcal{K}^\circ$. Then one can define, for $K_1, \ldots, K_M, C \in H_r$, and
zi, ...,zm& 

where the infimum is taken over families

$(C_0, K_{0,1}, \ldots, K_{0,M}, L_1, \ldots, L_{p^+ + p^-}) \in (H_r)^{M + 1 + p^+ + p^-}$
under the condition that for all $1 \leq j \leq M$,

i=i i=\ 

By the Dawson-Gärtner Theorem, we deduce that $((K^n(z))_{z \in \mathcal{K}^\circ}, C^n)$ satisfies a LDP for the
topology of pointwise convergence with good rate function

r{K, C) = sup sup {Il}'-'''riK{z,), K{zm), C). 

M zi<...<ZM,Zj£K.° 

Since exponential tightness is clear, this LDP can be reinforced into the uniform topology.
We then have to check that $I^\circ = J^\circ$.



From the definition of $I^\circ$, the first thing to check is that on the event $\{J^\circ(K(\cdot), C) < \infty\}$,
K is Lipschitz continuous on $\mathcal{K}^\circ$. The proof is similar to that of Lemma 5.5 as, once the
$L_i$ are given, K is Lipschitz on $\mathcal{K}^\circ$ as soon as $K_0$ is.

We now suppose that K is Lipschitz continuous on $\mathcal{K}^\circ$ and we want to identify the two
rate functions. By mimicking the proof at the end of Section 5.2, one can easily show
that for K Lipschitz continuous on $\mathcal{K}^\circ$,

sup sup r;r''''{K{z^),...,K{zM))=T*{K,C). (23) 

M zi,...,zm 

Now, in order to achieve this identification, we have to check that we can switch the 
supremum over M and the Zj's and the infimum over the admissible simultaneous decom- 
positions of K and C. It is clear that, 

p+ +p~ 

r{K,c)<r*{Ko{-),Co)+ 

i=l 

for any admissible choice of $L_i$, and therefore $J^\circ \leq I^\circ$ after optimisation. We now need
the converse inequality. By definition of $J^\circ$, if it is finite, then for any positive integer p,
there exist M(p) and $z_1, \ldots, z_{M(p)}$ such that

3\K, c) > {i2;rf''^r{K{z^, . . . , ir(.M(p)), c) - 



¹ We just have to be careful in the rewriting to put one border term for each interval involved in $\mathcal{K}^\circ$.
Now for each $z_1, \ldots, z_{M(p)}$ we choose an admissible decomposition (according to (22)) of
K so that

i=l ^ 

Moreover, for each M and choices of $z_1 < \cdots < z_M$,

Q'-'^^'{K{z,), . . . , K{zm)) = T*{K^^r''-', C) 

withi^-'-'^-(z) = E£il[....,.]i^W. 

By definition, since J^^-* and F* are good rate functions and as for all i, and 

i^Mip)), C) are uniformly bounded, it implies that the arguments 
are tight and we can take a converging subsequence. Let Kq and Lj be limits along a 
subsequence, we get 

r{K,c) > r(A-o(-),c) ^^^H^*) - - 

which ensures that $J^\circ(K, C) \geq I^\circ(K, C)$. This completes the proof of Theorem 8.1.



9. Large deviations principle for the largest eigenvalues in the case with outliers

We now state the main theorem of this section, namely an analogue of Theorem 6.1.
For any $\varepsilon, \rho$ small enough, we define the compact sets

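$$\mathcal{K}^\circ_{\varepsilon,\rho} := [b + \varepsilon, \varepsilon^{-1}] \setminus \bigcup_{i=1}^{p^+} (\ell_i - \rho, \ell_i + \rho)$$

and $\mathcal{K}^\circ_{\varepsilon} := [b + \varepsilon, \varepsilon^{-1}]$. We also define the set $\{\ell\} := \{\ell_1, \ldots, \ell_{p^+}, b\}$, and for $z \notin \{\ell\}$,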

sign of the product ni=i 

For any e, p,'j > 0, and a G ^^{b + e), we put 

:= |/ e C(/C°„M) : 3^ G C(/C°„M) with ^<g< ^,uj{g) < ^ 

and f{z) = s.R{z).g{z) ^{z - a^) \ 
We also denote by

^fc 7 ° ■~ {f ^ ^) '■ polynomial of degree m + — k with m + — k roots in 

and dominant coefficient l,g ^ C(/C° ,]R) with ■y < g < —,u(g) < — 

7 7 

and f{z) = s.g{z).R{z).p{z)} 

and 

/^e,p,o I I ^e,p,o 

0<k<m.+p+ 

Then the main statement of this section is the following. 

Theorem 9.1. Under Assumptions \2. 1\ \2.5\ and \2. b\ the law of the m + p'^ largest 
eigenvalues (A", . . . , AJ^^p+) of Xn satisfies a large deviation principle in M™+p^ with good 

rate function L°. For a = {ai, . . . ,am+p+) G we take a^+p+^i = b and L" is 

defined as follows : 



L\a) 



lim^^olimp4oinfu^>o5,^^'° ^ z/a G Q;^+p+_fc+i = 6, 

Um+p+^k > b for a k e {0,..., m}, 

oo otherwise. 



Even though the rate function L° is not very explicit, we show below that it must be 
infinite if Horn's inequalities are violated. 

Remark 9.2. Recall that the eigenvalues (A")i<j<„ of the sum of two Hermitian matrices 
with eigenvalues (A")i<j<„ and 9 := (^i, . . . , 0, . . . , 0) satisfy Horn's inequalities and are 
characterised by the fact that they satisfy such inequalities (see [33] for details). Assume 
that A := (Ai, . . . , \m+p+) is at distance of the bulk and of the outliers which is bounded 
below. We claim that the rate function L°{X) is infinite if (A, 9) do not satisfy the Horn 
inequalities. Indeed, if L°{\) is finite, (Ai, . . . , Am+p+) are zeroes of a function f which 
can be written 

f{z)=PeAK{z),C). 

with I°{K[-),C) finite. It implies that there exists sequences A" G G C" so that 

A" satisfies Assumptions \2.1\ \2.5\ and \2.6\ and 



1=1 ^ 1=1 

converge to K{z) (uniformly away from the bulk and the outliers) and C respectively. By 
definition, there exists a constant c such that 

n / ^ 

PeAK''iz),C^) l[iz - AD = cdet z - dmg{X^) - 

i=l \ i=l 

with Ui = gi in the i.i.d. perturbation model andui the Gram-Schmidt orthonormalization 
of the vectors gi in the orthonormalized perturbation model. Hence, the function fn{z) = 
PQ^r{K^{z), C") vanishes at the eigenvalues (A") of the sum of the two Hermitian matrices 
diag(A'") and Yll=i ^i'^i'^l (note that we can assume without loss of generality that its zeroes 
are different from A" by Lemma Therefore, (A", A", 6*) satisfy Horn's inequalities

by |33] . Since the (A") are bounded, they are relatively compact and we see that the limit 
points (Ai, . . . , Am+p+) of (A", . . . , AJ^^p+) which stay away from the bulk and the outliers 
are the zeroes of f . By passing to the limit in Horn's inequalities, we thus deduce that 
if the vector (Ai, . . . , \m+p+) has finite L" -entropy, and is away from the bulk and the 
outliers, (A, i, 9) satisfies Horn's inequalities. It would be interesting to have a direct proof 
of this fact. 

9.1. Proof of Theorem 9.1.

We now prove Theorem 9.1, following roughly the same lines as for Theorem 6.1.

As in the proof of Theorem 6.1, the crucial point is to use Proposition 3.1. In the
sticking case, if $z \in \mathcal{K}_\varepsilon$, for n large enough, the condition that z should not belong to
the set of eigenvalues of $X_n$ was very easy to check. Here, we need to make sure that
the eigenvalues are not exactly equal to the outliers to use our strategy. We show the
following

Lemma 9.3. Assume that the eigenvalues $\lambda_1^n, \ldots, \lambda_n^n$ of $X_n$ are pairwise distinct and that
Assumptions 2.1 and 2.5 hold. Then $X_n$ and $\widetilde{X}_n$ have no eigenvalue in common for almost
all G.

The proof of this lemma is postponed to Appendix 11.2. We shall therefore give the
proof of the Theorem when the eigenvalues of $X_n$ are distinct. This is however sufficient
to get the LDP without this hypothesis due to the following Lemma.

Lemma 9.4. Let X^ satisfy Assumptions \2. 1\ and \2.5[ Then, there exists a sequence Xn 
of matrices with pairwise distinct eigenvalues satisfying Assumptions \2.1\ and \2.5\ such 

that, if we define X„ be the perturbation of Xn by the i.i.d. or the orthonormalized 
vectors constructed on the law fin = fJ' * In of G + e{n)A with A r independent standard 

nornal variables and e{n) going to zero with n fast enough, then, with (A")i<m the extreme 

eigenvalues of Xn, 

1 / ^ ^ 1\ 

limsup — logP max |A" — A"| > — = — oo. 

n-j.oo n \l<i<m n J 

Proof. We take Xn to be the matrix with the same eigenvectors as Xn and the same 
eigenvalues except for those which are sticked together which we separate by an arbitrary 
small weight Wn < much smaller than the minimal distance between two distinct 
eigenvalues of X„, so that the eigenvalues of Xn are distinct and the operator norm of 
Xn — Xn is bounded above by w„. It is straightforward to verify Assumptions 12. l l and 12.51 
for Xn- Now, if we add the same perturbation to X„ and Xn respectively, their eigenvalues 
will differ at most by Wn almost surely. Then adding a Gaussian vector of variance e{nY 
to G will not change the eigenvalues by more than \J s{n) with probability greater than 
1 — e"^^*^^ ^" as the empirical covariance matrix of this additional term is bounded by 
C\Je{n) with such a probability. We conclude by choosing e{n) such that \/ s{n) <l/n. 
□ 
Lemma 9.4 means in particular the random variables (A")i<m and (A")j<m are exponentially
equivalent and [TB| Theorem 4.2.13] asserts that a large deviations principle for

the extreme eigenvalues (A")j<m of Xn entails the large deviations principle for the law 

of (A")j<m with the same rate function. Therefore, the proof of Theorem 12.51 can be done 

for the eigenvalues of Xn, the main advantage being that, from Lemma [9.31 above . we get 

that Xn and Xn have almost surely no eigenvalue in common and we can proceed as in 
the case without outliers. 



From now on, we assume that $X_n$ satisfies Assumptions 2.1 and 2.5 and has pairwise
distinct eigenvalues and that G satisfies Assumptions 2.2 and 2.6 and that its law is
absolutely continuous with respect to Lebesgue measure.

We first focus our attention on the function $H^n$ restricted to $\mathcal{K}^\circ_{\varepsilon,\rho}$ and show the
counterpart of Lemma 6.7, that is

Lemma 9.5. Let $\varepsilon, \rho$ be fixed. There exist a positive integer $n_0(\varepsilon, \rho)$ and $L(\varepsilon) > 0$ such
that for any $n \geq n_0(\varepsilon, \rho)$, for any $z \in \mathcal{K}^\circ_{\varepsilon,\rho}$,



111=1 Iki^^nii 5'n(^)-R(^) YYu=ii^ ~ K) in the orth. perturb, model, 



H'^iz) 



s gn{z)R{z) YYiLii^ " K) in the i.i.d. perturb, model, 

(24) 

with L{e) <gn<j^ and u{gn) < j^. 

In particular, for any $\varepsilon > 0$ and $\rho > 0$ small enough,



limsuplimsup-logP f ( f\(z - X'T^H'^(z)] 



e {C'J'^^y = -oo. 



Proof. In this case, 

1=1 1=1 ^ * ^ j=p++l * i=m+p++l \ * 

The proof is exactly the same as in the sticking case once we have noticed that, from 
Assumption 12. 5[ there exists no{e,p) such that for n > no{e,p), HiLi (^1 ~ ^x"-'z ) — 2?+' 
so that 



Note that we could similarly show that for n > no(e,p) 



2 ' ' 



gn{z) < L{e) := j e^. (25) 

The uniform equicontinuity is also shown very similarly. □ 
As in the sticking case, we have the analogue of Lemma 6.9 with $L^\circ$ instead of L.
To state the lemma more precisely, we introduce the following notation: we denote by
$G_k(\alpha, \delta, \varepsilon, \rho)$ the set of n-tuples $(\lambda_1^n \geq \cdots \geq \lambda_n^n)$ such that for all $i \leq m + p^+ - k$,

$$|\lambda_i^n - \alpha_i| < \delta \ \text{ if } \alpha_i \notin \{\ell\} \qquad\text{and}\qquad |\lambda_i^n - \alpha_i| < \rho \ \text{ if } \alpha_i \in \{\ell\},$$

and for all $m + p^+ - k + 1 \leq i \leq m + p^+$,

$$\lambda_i^n < b + \varepsilon.$$

Because of Lemma [93| Hn belong to the set of functions f{z) = h{z) YYiL^{z — ai)R{z) 
with a bounded positive constant h on 1C° ^ with values in [7,7^^] with overwhelming 
probability. But on this set also the zeroes are continuous function of the functions / 
and therefore we can proceed exactly as in the case without outliers. 

Lemma 9.6. Let a G and k between and m such that am+p+-k+i = ■ ■ ■ = am+p++i = 
b and a„i+p+_fc > b. We have 

lim lim lim sup lim sup — logP ((A", . . . , A") G G'fe(a, S, e, p)\ 

= limlimliminfliminf-logPf(A?,...,A") G Gk(a,S,e,p)) = -L°(a), 

eiO plO 510 n-s>oo n \ / 

with the obvious convention that nm+p+-fc+i<i<m+p+{-^" < b + e} = Q if k = 0. 
The proof is similar to the case without outliers. 



10. Application to $X_n$ random, following some classical matrix distribution

This section is devoted to the proofs of the results stated in Section 2.5.

10.1. Proof of Theorem 2.10.

Theorem 2.10 is a slight extension of [1, Theorem 2.6.6] and the proof will therefore follow
the same lines. We introduce the notations $\phi(\mu, x) = -V(x) + \beta \int \log|x - y| \, d\mu(y)$ (for x
greater than or equal to the right edge of the support of $\mu$) and $\mu^n = (n - p)^{-1} \sum_{i=p+1}^n \delta_{\lambda_i}$. Then

P;),^(dAi,...,dA„) = 

nV/{n-p),l3 n TE_. .A,)+/3 y^, log I A.-A, | ipn-p A WAi ■ ■ ■ dA 

By [1, Lemma 2.6.7], if parts i) and ii) of Assumption 2.9 hold, the law $P^n_{V,\beta}$ is
exponentially tight so that it is enough to estimate the probability of a small ball around
$x = (x_1 \geq x_2 \geq \cdots \geq x_p)$ (with $x_p \geq b_V$), namely events of the form $B(x, \delta) :=
\{\max_{1 \leq i \leq p} |\lambda_i - x_i| < \delta, \ \max |\lambda_i| \leq M\}$.



As in [8], a crucial observation is the fact that $\mu^n$ converges to $\mu_V$ much faster than
exponentially under $P^{n-p}_{nV/(n-p),\beta}$ (the LDP is indeed in the scale $n^2$). We can therefore replace
$\phi(\mu^n, \lambda_i)$ by $\phi(\mu_V, \lambda_i)$, whereas the ratio of partition functions converges by hypothesis.
To be more precise, let us first sketch the proof of the upper bound. Note that there
exists a constant $\Phi_M$ such that on $B(x, \delta)$, $\phi$ is bounded above by $\Phi_M$ so that

with Bfr{^v) a small ball with radius e around ixy for a distance compatible with the 
weak topology. As log P^yJ|-^_p^ ^(/t" G is bounded above by a negative real 

number for all 5 > by the LDP for the law of /i", the first term is negligible as n 
goes to infinity. Using the fact that $(\mu, x) \mapsto \phi(\mu, x)$ is upper semicontinuous, we obtain the
upper bound by first letting n go to infinity, then letting $\delta$ decrease to zero and finally
letting $\varepsilon$ go to zero. Notice again that in the proof of this upper bound, we use part i) of
Assumption 2.9 to get the LDP for $\mu^n$ and ii) to control the ratio of the partition functions.



The lower bound is similar to the proof in [1, p. 84], which corresponds to p = 1. We
proceed by induction on p and we can therefore assume that p is the smallest integer such
that $x_p > b_V$. There exist $x_i^\varepsilon$, $1 \leq i \leq p$, whose small neighbourhoods are included in the
$\delta$-neighbourhood of $x_i$, $1 \leq i \leq p$, and which are distinct, so that for $\varepsilon$ small enough

Pvflfnaax lAj — xA < 6) > P^^ofmax lAj — xf I < e, Xi < x^ — 6 — e, Vz > p) 

^'l^^lKiKp' ^'P^l<i<p' ^ 

^n-p 

> exp (n-p) inf P^v/Vp)^'^" ^ e)), 

^V,I3 \ \y,-xf\<s J l\ PI 

with _B[_M,2,p-5-e](Aty5 the set of probability measures in B^^iiy) with support in 
[-M,Xp - 6 - e\. When the distinct and away from their logarithmic in- 

teraction is negligible; moreover, part %%%) of Assumption 12.91 allows to claim that the last 
term in the lower bound above converges to one. We therefore get 

1 ^ 
liminf — logPy^(max |Aj — Xj| < 5) > — > Jvi^f) ~ otye 



i=l 



Now, $J_V$ is continuous away from the support of $\mu_V$ so that we can conclude by letting $\delta$
go to zero.

Then to get the correct expression of the rate function, we just have to check that = 
pcxy^is, which is easy and left to the reader. □ 



10.2. Proof of Theorem _ 

As explained in Section 2.3, we have to study $\widetilde{X}_n$, when $X_n$ is diagonal with eigenvalues
having $P^n_{V,\beta}$ as their joint law and the $U_i$'s obtained by orthonormalisation procedure from
$G = (g_1, \ldots, g_r)$ i.i.d. standard Gaussian.

The proof will consist in first fixing the possible deviations of the extreme eigenvalues
of $X_n$ (hence providing outliers) and then, being given these outliers, computing the
deviations of the eigenvalues of $\widetilde{X}_n$. The main point of course is that with exponentially
large probability, only a finite number of eigenvalues of $X_n$ can deviate.
• More precisely, we observe that, by Theorem 2.10, for all $p \in \mathbb{N}^*$, the probability that
$\lambda_p^n$ is greater than $b_V + \delta$ is less than $e^{-n p \varepsilon(\delta)}$ for $\varepsilon(\delta) = \inf_{(b_V + \delta, +\infty)} J_V$. The only point
to check is that $\inf_{(b_V + \delta, +\infty)} J_V > 0$, which is a consequence of part iii) of Assumption 2.9.

• The deviations of the eigenvalues of $X_n$ are controlled by Theorem 2.10: there exists
$\varepsilon(\eta, \ell) > 0$ so that for n large enough, for $\varepsilon \leq \varepsilon(\eta, \ell)$,

- JP(£i, . . . , £p) - r/ < - log P f max I Ar - £.1 < ^, K+i <bv + e]< -J'ih, 

(26) 

For all . . . , £p) and ?7 > 0, we define the set 

. . . = {(Ai, . . . , A„) e M" ; maxi<i<p < e{r]J),Xp+i < by + e{vJ)} ■ 

• Now, knowing the deviations of the eigenvalues of $X_n$, one can treat them as outliers
and deal with the eigenvalues of the perturbed model. We have that, for any $(\ell_1, \ldots, \ell_p) \in
(b_V, +\infty)^p$ and any $\eta > 0$, there exist $\varepsilon(\eta, \ell), \delta(\eta, \ell) > 0$ so that for n large enough, for
$\varepsilon \leq \varepsilon(\eta, \ell)$, $\delta \leq \delta(\eta, \ell)$,



maxi<i<p |A" - £i| <e, 
K+i <bv + e 



(27) 



These inequalities are a consequence of Theorem 19. II Indeed, let Xn be a matrix such that 
the event {maxi<j<p |A" — £j| < e} holds. Let be a real diagonal matrix with same 
eigenvalues of X„ except its k largest eigenvalues are equal to the outliers {ii, . . . , ip). 

Then we have ||X„ — X^||oo < so that, with obvious notations, ||X„ — X„ ||oo < e, so 

that the ordered eigenvalues of Xn and Xn differ at most by e. Thus, up to change 6 into 
6±e, Theorem [O gives ([27]). 

• We now have all the ingredients to prove the LDP. It is clear that since the largest
eigenvalues of $X_n$ are exponentially tight, so are the eigenvalues of $\widetilde{X}_n$, and therefore
it is enough to prove a weak large deviation principle. We let K(L) be such that the
probability that $\lambda_1^n$ or $\widetilde{\lambda}_1^n$ is greater than K(L) is smaller than $e^{-nL}$.

• To prove the upper bound we can write, for any $p \geq k$, any $\eta > 0$, $\delta > 0$,

P ( max |A" - < 5 1 < P (max I A" - xA < 6, A" , < by + 6] + e'^^'^^) (28) 

\^l<i<fc J \l<i<k J 

We fix $\eta > 0$. As $[b_V, K(L)]^p$ is compact, from its infinite open covering $\bigcup V_\eta(\ell_1, \ldots, \ell_p)$, one
can always extract a finite covering $\bigcup_{1 \leq s \leq M(\eta)} V_\eta(\ell_1^s, \ldots, \ell_p^s)$. We then take $\delta = \min_s \delta(\eta, \ell^s) >
0$. Thus, we get by the LDP estimate (26)
P( max |A" -Xi\<25] < e""^ + e-"^"^^) + V P I max |A" - xA < 25 n VU' . ..,£')

\l<i<k J \l<i<k V 1 p/^ 



g t-j^ , . . . ,tp 



which gives the announced bound by taking first the limit as n goes to infinity, then L, p
to infinity and finally $\delta$ and $\eta$ to zero.

• The lower bound is easier as we simply write 

P( max < 25] > p( max I A" - xj < 25 nViit . . . ,tj 

\l<i<k J \l<i<k ' 

and use the large deviation theorems. □ 

11. Appendix 
11.1. Proof of a technical lemma. 



With the notations of Section 14. H we have the following result 
Lemma 11.1. Under Assumption \2/A for any 1 < Iq < r , we have 



limlimsup-logP( 

04-0 n-5-oo n 



4 



-oo. 



Proof. To simplify the notations, we shall assume that Iq = r. 

Recall that the G"'s were constructed from a family {G{k) = {gi{k), . . . ,gr{k))k>i of 
independent copies of G, via the formula := {gi{l), . . . , gi{n))^. For 1 < A;, we consider 
the random r x r Hermitian matrix 

Zk = GikyCik) = [gi{k)gj{k)],<i,,<r and ^" = - XI 

k=l 

By Cramer's Theorem [TB], we have that the law of satisfies a LDP with convex 
good rate function 

J(^)(y)= sup{(A,y)-A(A)}, 

where A(A) = logE(e^^''^^^) is exactly the function defined in Equation (jH]). 

Note that since for all n, is almost surely a positive semi-definite matrix, by closed- 
ness of the set of such matrices, the domain of / is contained in the set of positive 
semi-definite matrices. 
Let $P_r$ be the real polynomial function on $H_r$ introduced in Proposition 4.1; we have
||g"iy"||2 = PriL"^)- Therefore, if, for any 5 > 0, we introduce the closed set £s := {y G 
Hf. ; Pr{y) < 5}, we have 

limsup-logP(||g;^W;"||^ < 6) < - inf I^^\y). 

n—^oo ^ y&^s 

Let us assume that 

M := lim inf I^^\y) < oo. 

Since I^^^ is a good rate function, there exists a compact set K such that inf j^gA'c j(^) (y) > 
M, so that for all 5 > 0, infyg^^. I^^\y) = miy^g^riK I^^^y)- Moreover the infimum on £s 
is reached : let, for all n > 0, be an element of K such that I^^\yn) = infyg^^ I^^\y). 

There exists a subsequence $\varphi(n)$ such that $y_{\varphi(n)}$ converges, as n goes to infinity, to some $y_0$.
By continuity of $P_r$, $P_r(y_0) = \lim_{n \to \infty} P_r(y_{\varphi(n)}) = 0$. It follows, by the last part of
Proposition 4.1, that $y_0$ is not positive definite. However, since $I^{(Z)}$ is lower semicontinuous, we
have $I^{(Z)}(y_0) \leq M < \infty$, which implies that $y_0$ is a positive semi-definite matrix. Let p
be the orthogonal projection onto $\ker y_0$. Note that $p \neq 0$ and that $\langle p, y_0\rangle = \mathrm{Tr}(y_0 p) = 0$.

/(^)(2/o) = sup{(A,yo)-A(A)} 

> sup{(-tp, yo) - A(-tp)} 

= sup —A{—tp) 

t>o 

= +00 by (HI]), 

which yields a contradiction (as we already proved that I^^\yo) < M). 

Similarly, as $I^{(Z)}$ is a good rate function, it has compact level sets and therefore has to
be large on the set $\{y : P_r(y) \geq 1/\delta\}$. Hence,

limsuplimsup-logPdlg^'W;"!!^ > r^) = -00 

54,0 n— 5>oo 

which completes the proof of the lemma. □ 
11.2. On the eigenvalues of the deformed matrix. 



The goal of this section is to prove Lemma 9.3. In fact, we will prove the slightly more
general

Lemma 11.2. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. Let us fix some positive integers n, r such that
n > r, a self adjoint n × n real matrix X with eigenvalues $\lambda_1, \ldots, \lambda_n$, and some non null
real numbers $\theta_1, \ldots, \theta_r$. We make the following hypothesis:

(H) $\lambda_1, \ldots, \lambda_n$ are pairwise distinct and there are pairwise distinct indices $i_1, \ldots, i_{r-1} \in
\{1, \ldots, n\}$ such that $\{\lambda_{i_1} + \theta_1, \ldots, \lambda_{i_{r-1}} + \theta_{r-1}\} \cap \{\lambda_1, \ldots, \lambda_n\} = \emptyset$.

Let us define, for $g = [g_1, \ldots, g_r] \in \mathbb{K}^{n \times r}$,

$$X_g := X + \theta_1 u_1 u_1^* + \cdots + \theta_r u_r u_r^*,$$

where $(u_1, \ldots, u_r)$ is either the orthonormalized family deduced from the columns of g by
the Gram-Schmidt process or $\frac{1}{\sqrt{n}}(g_1, \ldots, g_r)$.

Then the Lebesgue measure of the set of the g's such that $X_g$ and X have at least one
eigenvalue in common is null.

Now, Lemma 19731 will be easy to deduce from the above. Indeed, one can check that for 
n large enough, Xn satisfies hypothesis (H). We know that its eigenvalues A^, . . . , are 
distinct. Moreover, let rj be such that rj < | mini<j<j. \9i\ and rj < |minj^j \ii — From 
Assumption 12.51 there exists n large enough so that X„ has at most p'^ eigenvalues greater 
than 6 + ?7, at most p~ eigenvalues smaller than a — r], more than 2r{p'^ + 1) eigenvalues 
in the interval [h — rj^h + rj) and more than 2r{p~ + 1) eigenvalues in {a — T],a + rj). 
Let us assume that $\theta_1 > 0$. Then one can find an eigenvalue $\lambda_{i_1}$ among the $p^+ + 1$ greater
ones in $(b - \eta, b + \eta)$ such that $\lambda_{i_1} + \theta_1$ does not belong to $\{\lambda_1^n, \ldots, \lambda_n^n\}$. We then forget the
$p^+ + 1$ greater eigenvalues and look at the $p^+ + 1$ following ones. Among them, one can
find an eigenvalue $\lambda_{i_2}$ such that $\lambda_{i_2} + \theta_2$ does not belong to $\{\lambda_1^n, \ldots, \lambda_n^n\}$, and so on. For
the negative $\theta_i$'s, we consider the $p^- + 1$ smallest eigenvalues in $(a - \eta, a + \eta)$.

We now prove Lemma 11.2.

Proof. The idea of the proof is the following. We shall first prove (in Step I) that the
set of g's such that $X_g$ and X have at least one eigenvalue in common is, up to a set of
null Lebesgue measure, the set of zeroes of a polynomial function. Since it can easily be
proved, by induction on the number of variables, that the set of zeroes of any non null
polynomial in several real variables has null Lebesgue measure, proving (in Step II) that
this function is not identically null will then imply that the set of such g's has vanishing
Lebesgue measure.

Let $\beta$ be either 1 or 2 according to whether $\mathbb{K}$ is $\mathbb{R}$ or $\mathbb{C}$.

Step I. Let us first treat the case where $(u_1, \ldots, u_r) = \frac{1}{\sqrt{n}}(g_1, \ldots, g_r)$. Let us define P
to be the polynomial of $\beta n r$ real variables which maps $[g_1, \ldots, g_r] \in \mathbb{K}^{n \times r}$ to the resultant
of the characteristic polynomials of X and $X_g$. The set of g's in $\mathbb{K}^{n \times r}$ such that X and
$X_g$ have an eigenvalue in common is exactly the set of g's such that P(g) = 0: Step I is
achieved in the case where $(u_1, \ldots, u_r) = \frac{1}{\sqrt{n}}(g_1, \ldots, g_r)$.

Let us now treat the case where $(u_1, \ldots, u_r)$ is the orthonormalized family deduced
from the columns of g by the Gram-Schmidt process. In this case, the resultant of
the characteristic polynomials of X and of $X_g$ is no longer a polynomial function
of the real coordinates of g, so we shall use the following trick. It can easily be noticed,
through a careful look at the Gram-Schmidt process, that for all $k \in \{1, \ldots, r\}$, for all
$i, j \in \{1, \ldots, n\}$, there are two polynomial functions $D_k, N_{k,i,j}$ of $\beta n r$ real variables
such that the i, j-th entry of $u_k u_k^*$ is $N_{k,i,j}(g) / D_k(g)$ and that $D_k(g)$ is positive for any $g \in \mathbb{K}^{n \times r}$
whose columns are linearly independent. Let us define the polynomial function of $\beta n r$
real variables

$$D(g) := \prod_{k=1}^r D_k(g).$$

For any g such that D(g) > 0 (which is the case for any $g \in \mathbb{K}^{n \times r}$ whose columns are
linearly independent), X and $X_g$ have no eigenvalue in common if and only if D(g)X and
$D(g)X_g$ have no eigenvalue in common. Now, the advantage of having replaced X and $X_g$
by D(g)X and $D(g)X_g$ is that the entries of D(g)X and $D(g)X_g$ are polynomial functions
of g. Hence if one defines P(g) to be the resultant of the characteristic polynomials of
D(g)X and $D(g)X_g$, P(g) is a polynomial function of the $\beta n r$ real coordinates of g and,
up to the set (with zero Lebesgue measure) of g's in $\mathbb{K}^{n \times r}$ whose columns are not linearly
independent, the set of g's in $\mathbb{K}^{n \times r}$ such that X and $X_g$ have an eigenvalue in common is
exactly the set of g's such that P(g) = 0: Step I is achieved in the second case.
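
To make the resultant criterion of Step I concrete, here is a minimal Python sketch. It is an illustration, not the authors' construction: it only treats characteristic polynomials of diagonal matrices with rational eigenvalues, and the helper names `sylvester_matrix`, `determinant`, `resultant` and `char_poly_diag` are hypothetical. The resultant is computed exactly as the determinant of the Sylvester matrix, and it vanishes precisely when the two polynomials share a root.

```python
from fractions import Fraction


def sylvester_matrix(p, q):
    """Sylvester matrix of two polynomials given as coefficient lists, highest degree first."""
    m, n = len(p) - 1, len(q) - 1  # degrees of p and q
    size = m + n
    rows = []
    for shift in range(n):  # n shifted copies of the coefficients of p
        rows.append([Fraction(0)] * shift + [Fraction(c) for c in p]
                    + [Fraction(0)] * (size - m - 1 - shift))
    for shift in range(m):  # m shifted copies of the coefficients of q
        rows.append([Fraction(0)] * shift + [Fraction(c) for c in q]
                    + [Fraction(0)] * (size - n - 1 - shift))
    return rows


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def resultant(p, q):
    """Resultant of p and q; zero if and only if they have a common root."""
    return determinant(sylvester_matrix(p, q))


def char_poly_diag(eigenvalues):
    """Coefficients (highest degree first) of prod_i (z - lambda_i)."""
    coeffs = [Fraction(1)]
    for lam in eigenvalues:
        shifted = coeffs + [Fraction(0)]                    # multiply by z
        scaled = [Fraction(0)] + [c * lam for c in coeffs]  # multiply by lambda
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs


print(resultant(char_poly_diag([1, 2]), char_poly_diag([3, 4])))  # disjoint spectra
print(resultant(char_poly_diag([1, 2]), char_poly_diag([2, 3])))  # shared eigenvalue 2
```

For monic polynomials the resultant equals the product of all differences of roots, which is why it detects a common eigenvalue and why, as a function of the entries of g, it is polynomial whenever the matrix entries are.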

Step II. Let us now prove that in both cases, the polynomial function $g \mapsto P(g)$ is
not identically null. To treat both cases together, it suffices to prove that there exists
$g = [g_1, \ldots, g_r] \in \mathbb{K}^{n \times r}$ with orthonormalized columns such that $X_g$ and X have no
eigenvalue in common. One can suppose that $i_1 = 1, \ldots, i_{r-1} = r - 1$, that $\lambda_r < \cdots < \lambda_n$
and that

$$X = \mathrm{diag}(\lambda_1, \ldots, \lambda_n).$$

We shall choose the r − 1 first columns $g_1, \ldots, g_{r-1}$ of g to be the r − 1 first elements of
the canonical basis and $g_r$ with null r − 1 first coordinates and unit norm. With such a
choice of g, we have

$$X_g = \mathrm{diag}(\lambda_1 + \theta_1, \ldots, \lambda_{r-1} + \theta_{r-1}, \lambda_r, \ldots, \lambda_n) + \theta_r g_r g_r^*.$$

Let us suppose that $\theta_r > 0$. It was shown in [21, Section 3.2] that as $g_r$ runs through the
set of unit norm vectors of $\mathbb{K}^{n \times 1}$ with null r − 1 first coordinates, the ordered eigenvalues
of the $(n - (r - 1)) \times (n - (r - 1))$ lower right block of $X_g$ describe the set of families $\mu_r, \ldots, \mu_n$
of real numbers which sum up to $\lambda_r + \cdots + \lambda_n + \theta_r$ and such that

$$\lambda_r \leq \mu_r \leq \lambda_{r+1} \leq \cdots \leq \lambda_n \leq \mu_n.$$

One can easily find such a family $\mu_r, \ldots, \mu_n$ such that

$$\{\mu_r, \ldots, \mu_n\} \cap \{\lambda_1, \ldots, \lambda_n\} = \emptyset,$$

which concludes the proof, by hypothesis (H). □
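
The interlacing picture used in Step II can be explored numerically. The sketch below is an illustration under simplifying assumptions that are not in the text: a real diagonal matrix with strictly increasing eigenvalues, a positive coefficient `theta`, and a unit vector `g` with no zero entry. The function name is hypothetical. It locates the eigenvalues of diag(lam) + theta g gᵀ by bisection on the secular equation 1 = theta Σ_i g_i² / (mu − lam_i), which has exactly one root between consecutive eigenvalues of the unperturbed matrix and one above the largest.

```python
import math


def rank_one_update_eigenvalues(lam, g, theta, iterations=200):
    """Eigenvalues of diag(lam) + theta * g g^T, for theta > 0, lam strictly
    increasing and all entries of g nonzero, by bisection on the secular equation."""
    weights = [x * x for x in g]

    def secular(mu):
        # f(mu) = 1 - theta * sum_i g_i^2 / (mu - lam_i); increasing between consecutive poles
        return 1.0 - theta * sum(w / (mu - l) for w, l in zip(weights, lam))

    # One root in each (lam_i, lam_{i+1}) and one in (lam_n, lam_n + theta * |g|^2]
    upper_ends = lam[1:] + [lam[-1] + theta * sum(weights)]
    roots = []
    for lo, hi in zip(lam, upper_ends):
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:  # interval exhausted at float precision
                break
            if secular(mid) < 0.0:
                lo = mid  # root lies to the right of mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots


lam = [0.0, 1.0, 2.5, 4.0]
norm = math.sqrt(10.0)
g = [1.0 / norm, 2.0 / norm, 1.0 / norm, 2.0 / norm]
mu = rank_one_update_eigenvalues(lam, g, theta=1.5)
print(mu, sum(mu) - sum(lam))
```

In this example the perturbed eigenvalues strictly interlace the original ones and their sum exceeds the original trace by `theta`, in line with the trace constraint and the interlacing inequalities quoted above; in particular the two spectra are disjoint.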